How to parse and extract data from Gmail to Google Sheets | blog.gsmart.in

Extracting data from Gmail messages and routing it to other automation steps is easy to do with Google Apps Script.

My bank sends a confirmation email for every transaction on my account. I will parse these emails and collect the expense data in a Google Sheet. At the end of the month, I will have a summary of all the expenses in the sheet, ready to process and analyze.

With a little customization for your own email messages, you can automate this process for your own requirements.

Create a standalone web app in Google Apps Script

For this project, we need to create a standalone web app in Google Apps Script. Go to https://script.google.com. Create a new script. Give a name to the script.

Filter emails

The search feature in Gmail is quite powerful, and you can get to the right emails using the Gmail search operators. Read more about Gmail search operators in this article.

You can use search operators to filter and retrieve transaction emails. For this project, we only need the transaction confirmation emails and not the marketing emails from the bank.

The search matches messages from “citicorp.com” whose subject contains “transaction confirmation…”.
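A minimal sketch of such a search function follows. The sender domain and subject text come from the description above; the function name `getRelevantMessages` and the limit of 10 threads are my assumptions, so adjust them to your own emails.

```javascript
// Sketch only: the function name and the limit of 10 are assumptions.
function getRelevantMessages() {
  var query = 'from:citicorp.com subject:"transaction confirmation"';
  // GmailApp.search(query, start, max): start at thread 0, return at most 10 threads.
  var threads = GmailApp.search(query, 0, 10);
  var messages = [];
  threads.forEach(function (thread) {
    // A thread can hold several messages; keep the first one of each thread.
    messages.push(thread.getMessages()[0]);
  });
  return messages;
}
```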

Note that I pass the second and third parameters (the start index and the maximum number of results) to the search() function. This limits the number of messages returned and makes the script run faster. If too many messages match the criteria, the search takes a long time to retrieve them all; setting a limit keeps it fast.


Let us display the messages returned by the search.

First, add an HTML template. Go to File → New and create a new HTML file. Let’s call it “messages.html”.

The script then needs to fetch the messages and pass them to the template for display.
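Here is a sketch of how that could look. The template contents shown in the comment and the helper name `getMessageBodies` are assumptions of mine; `doGet()` is the entry point Apps Script calls when the web app URL is opened.

```javascript
// Assumed contents of messages.html (Apps Script scriptlet syntax):
//   <ul>
//     <? for (var i = 0; i < messages.length; i++) { ?>
//       <li><?= messages[i] ?></li>
//     <? } ?>
//   </ul>

// Hypothetical helper: returns the plain-text body of the first message in each thread.
function getMessageBodies() {
  var query = 'from:citicorp.com subject:"transaction confirmation"';
  var threads = GmailApp.search(query, 0, 10);
  return threads.map(function (thread) {
    return thread.getMessages()[0].getPlainBody();
  });
}

// Web app entry point: load messages.html, hand it the data, and render it.
function doGet() {
  var template = HtmlService.createTemplateFromFile('messages');
  template.messages = getMessageBodies();
  return template.evaluate();
}
```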

You can customize the search filter so you get the right emails.

Choose File → Save.

Go to the menu item Publish → Deploy as web app.

Then grant the permissions. At the end of the flow, it gives you the link to open the script in the browser. Open the link. (Click the link for the latest development version.)

The web page should display the messages.

Parse the message

The next step is to parse the message and extract the data that interests us. Regular expressions are a readily available tool for this.

A tool that makes it easier to build the regular expression is regexr.com.


Copy and paste the message text into the tool’s text area, and then compose the regular expression step by step. The tool shows the matches live, which makes building the regular expression easier.

Once the regular expression is ready, parsing the data is easy.
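As an illustration, here is a sketch of a parsing function. The message wording it expects, the field names, and the function name `parseMessageData` are all hypothetical; your bank's emails will need their own pattern.

```javascript
// Hypothetical message format assumed by this sketch:
//   "... a transaction of USD 1,045.20 was made on card ending 1234
//    at ACME STORE on 12-Mar-2019 ..."
function parseMessageData(text) {
  var pattern = /transaction of ([A-Z]{3}) ([\d,]+\.\d{2}) was made on card ending (\d{4}) at (.+?) on (\d{2}-[A-Za-z]{3}-\d{4})/;
  var match = pattern.exec(text);
  if (!match) {
    return null; // Not a transaction email in the expected format.
  }
  return {
    currency: match[1],
    amount: parseFloat(match[2].replace(/,/g, '')), // Drop thousands separators.
    card: match[3],
    merchant: match[4],
    date: match[5] // Kept as text.
  };
}
```

Returning null for a non-matching message lets the caller skip emails that slipped through the search filter.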

Let us verify that the parsing actually gets the correct data.

Create another HTML file and name it parsed.html. We will send the parsed records to the HTML template and display them.


The script parses each message and passes the resulting records to the template.
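One way to wire this up is sketched below. It assumes your project defines a parsing function named `parseMessageData` that returns an object for a matching message and null otherwise; that name, and the helpers `collectRecords` and `getParsedRecords`, are my own choices for illustration.

```javascript
// Pure helper: run the parser over each body and drop the bodies that do not match.
function collectRecords(bodies, parse) {
  return bodies
    .map(function (body) { return parse(body); })
    .filter(function (record) { return record !== null; });
}

function getParsedRecords() {
  var query = 'from:citicorp.com subject:"transaction confirmation"';
  var threads = GmailApp.search(query, 0, 10);
  var bodies = threads.map(function (thread) {
    return thread.getMessages()[0].getPlainBody();
  });
  // Assumption: parseMessageData is your own parser, defined elsewhere in the project.
  return collectRecords(bodies, parseMessageData);
}

// Web app entry point: render parsed.html with the parsed records.
function doGet() {
  var template = HtmlService.createTemplateFromFile('parsed');
  template.records = getParsedRecords();
  return template.evaluate();
}
```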

Note: I did not parse the date into a Date object, to keep the code simple. Parsing the date takes only a few extra steps.
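For reference, here is a sketch of those extra steps. It assumes the date appears as text like "12-Mar-2019", which is a hypothetical format; the function name `parseTransactionDate` is also mine.

```javascript
// Converts text such as "12-Mar-2019" (assumed format) into a Date in local time.
function parseTransactionDate(text) {
  var months = ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
                'jul', 'aug', 'sep', 'oct', 'nov', 'dec'];
  var parts = /^(\d{1,2})-([A-Za-z]{3})-(\d{4})$/.exec(text);
  if (!parts) {
    return null; // Text is not in the assumed format.
  }
  var month = months.indexOf(parts[2].toLowerCase());
  if (month === -1) {
    return null; // Unknown month abbreviation.
  }
  // Date months are zero-based, so the array index can be used directly.
  return new Date(Number(parts[3]), month, Number(parts[1]));
}
```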


Save the data in a Google Sheet

Now that we’ve extracted the data, the next step is to save it to a Google Sheet.

The steps are quite simple. Open the spreadsheet, then add the rows to the sheet.
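A sketch of those two steps follows. The spreadsheet ID is a placeholder, and the record fields (date, merchant, card, amount) and the function name `saveRecords` are assumptions; use whatever fields your parser produces.

```javascript
// Placeholder: replace with the ID from your own spreadsheet's URL.
var SPREADSHEET_ID = 'YOUR_SPREADSHEET_ID';

// Appends one row per record to the first sheet and returns the number of rows added.
function saveRecords(records) {
  var spreadsheet = SpreadsheetApp.openById(SPREADSHEET_ID);
  var sheet = spreadsheet.getSheets()[0];
  records.forEach(function (record) {
    // Assumed record fields; the column order here is arbitrary.
    sheet.appendRow([record.date, record.merchant, record.card, record.amount]);
  });
  return records.length;
}
```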

Prevent the same message from being processed over and over

If you call processTransactionEmails() multiple times, it keeps adding the same rows again and again.

One way to prevent that is to add a label to processed emails.

We add the payment_processing_done label to the messages that have been processed. Notice that the search filter now includes -label:payment_processing_done.

This retrieves only the messages that do not have the “payment_processing_done” label.
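The labeling step might be sketched like this. Note that GmailApp applies labels to threads rather than to individual messages. The helper name `getOrCreateLabel` and the limit of 10 threads are assumptions, and the parse-and-save step is left as a comment.

```javascript
var DONE_LABEL = 'payment_processing_done';

// Returns the label, creating it on first use.
function getOrCreateLabel(name) {
  return GmailApp.getUserLabelByName(name) || GmailApp.createLabel(name);
}

function processTransactionEmails() {
  // Exclude threads that already carry the "done" label.
  var query = 'from:citicorp.com subject:"transaction confirmation" -label:' + DONE_LABEL;
  var threads = GmailApp.search(query, 0, 10);
  var label = getOrCreateLabel(DONE_LABEL);
  threads.forEach(function (thread) {
    // Parse the messages and append the rows to the sheet here,
    // then mark the thread so the next run skips it.
    thread.addLabel(label);
  });
  return threads.length;
}
```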

Process emails as they arrive at Gmail

To fully automate this, the script should run whenever there is a new email. However, there is no easy way to trigger the script when a new email arrives.

The alternative is to trigger the script on a timer. Let’s run it every hour.

For each run, we need to check the emails received in the last hour. However, there is no direct search filter for that.

The Gmail search newer_than:1d matches emails received in the last day. Combine it with the rest of the operators and we have a narrow enough filter.


The updated filter therefore adds newer_than:1d to the sender, subject, and label conditions used earlier.
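Put together, the filter could be built as in this sketch. The function name `buildSearchQuery` is hypothetical; the operators themselves are the ones discussed above.

```javascript
// Joins the individual search operators into one Gmail query string.
function buildSearchQuery() {
  return [
    'from:citicorp.com',
    'subject:"transaction confirmation"',
    'newer_than:1d',                    // Only emails from the last day.
    '-label:payment_processing_done'    // Skip threads already processed.
  ].join(' ');
}
```

Because the label check already excludes processed threads, the one-day window overlapping between hourly runs does not cause duplicate rows.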

Timer trigger

The next step is to trigger this script every hour. We have to add a time trigger. Go to the menu item Edit → Current project’s triggers.

Click the button to add a new trigger.

Then create a time-driven trigger. Select the function “processTransactionEmails” as the function to run.
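If you prefer to set up the trigger from code instead of the menu, a sketch using the ScriptApp service is shown below. The function name `createHourlyTrigger` is my own, and it assumes the handler is named processTransactionEmails. Run it once by hand from the script editor.

```javascript
// Installs an hourly trigger for processTransactionEmails,
// removing any earlier trigger for the same handler to avoid duplicates.
function createHourlyTrigger() {
  var handler = 'processTransactionEmails';
  ScriptApp.getProjectTriggers().forEach(function (trigger) {
    if (trigger.getHandlerFunction() === handler) {
      ScriptApp.deleteTrigger(trigger);
    }
  });
  ScriptApp.newTrigger(handler)
    .timeBased()
    .everyHours(1)
    .create();
}
```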


See also

  • Get full source code here
  • From Google Sheets, how to send email based on date
  • Gmail search operators and examples (download in PDF)
