This article provides you information on how Emarsys opens connection to your database in order to generate and run queries to get the personalized content for email campaigns with Relational Data.
When does Emarsys open connection to your database?
Emarsys establishes connection to your database with Relational Data Service (RDS) when you either
- utilize Relational Data features, eg.: launching email campaigns with Relational Data personalization or using an Relational Data segment in an Automation Center program,
- browse the User Interface (UI) of the Emarsys Marketing Platform to collect metadata, or
- import data using the Relational Data API.
Various connection pools
Relational Data uses separate connection pools to avoid interference between segmentation, personalization and data import. This logic ensures that a slow segment running cannot block data import through the Relational Data API.
Open connection time: Emarsys attempts to close the connections as fast as possible, if a connection pool has not been used for a while.
Connection timeout: All connectors have a configured 5 seconds connection timeout.
Exception: Amazon Redshift has 15 seconds timeout.
How to generate and run queries for personalization?
Once the connection has been established to your database Emarsys can generate and run queries to get personalized content.
Use case example
To walk you through how to run queries we use a real-life use case and its database for our example.
An airline company wants to send special upsell offers with recommendations in personalized email campaigns to their contacts according to the destination city and date of arrival. The upsell email campaign is sent with personalized offers for services to buy, such as renting cars or booking hotels at discount prices in the destination city, before the date of arrival.
Example for upsell email content:
Based on the contact information we determine to which city and at what time the contact is traveling to, furthermore which service is available for him at the best price in the local currency, for the proper service duration at that city, e.g.:
Hi,
We have seen that you will travel to Vienna on the <date_of_arrival>
.
Did you know that you can buy the following services at a discount price?
<service_name>
at <company_name> <price> <currency> <service_duration>
Setting up your database
Relational databases store data in tables, which are made up of named columns and rows of data. The rows are also called relations, hence their name. For our use case example, you have your customers’ bookings data in your bookings table (Columns: contact_id
, booking_id
, date_of_arrival
, destination_city
). The upsell services can be obtained from a view called upsell_services. (Columns: service_type, service_name, company_name, city, price, currency, price_duration, availability
). You also store information about your contact in the built-in Emarsys contact database. With Relational Data Service Emarsys can connect to your own (external) databases (which needs to be pre-configured), for details, click here.
View upsell_services
For this query example we use the view called upsell_services.
service_type | service_name | company_name | city | price | currency | price_duration | availability |
---|---|---|---|---|---|---|---|
rent_a_car | Rent a car | ViaRent | Vienna | 10 | Euro | hour | yes |
rent_a_car | Rent a car | BerliRent | Berlin | 14 | Euro | hour | no |
rent_a_car | Rent a car | KaiRent | Cairo | 6 | Egyptian Pound | hour | yes |
rent_a_car | Rent a car | LDRent | London | 8 | Pound | hour | yes |
accomodation | Accomodation | GreatHotelVienna | Vienna | 130 | Euro | night | yes |
rent_a_bike | Rent a bike | TwoWheelsBerlin | Berlin | 4 | Euro | hour | yes |
rent_a_bike | Rent a bike | TwoWheelsBerlin | Vienna | 2 | Euro | hour | yes |
accomodation | Accommodation | AwesomeHotel | Berlin | 150 | Euro | Night | no |
baggage_insurance | baggage_insurance | InsuranceLTD | General | 12 | Euro | trip | Yes |
Setting up reference fields
In the email message we want to use the value of service_name
based on the value of the city
and its availability
.
In order to do that, we have to set two reference fields, which are city
and availability
. We use these 2 reference fields both together to identify the values for personalization.
ESL
Script to have the name of an available service located in Vienna:rds.myconnection.upsell_services("Vienna","yes")[0].service_name
Query
Emarsys runs the following query to have the value of service_name
where the city
= "Vienna" and the availability
= "yes":
select distinct service_name from upsell_services where (city = "Vienna" AND availability = "yes") limit x
Setting up a default value
In case you expect to have some values not represented in your view and you want to make sure that even in that case you have a content there you can set a default value.
ESLrds.myconnection.upsell_services("Los_Angeles","yes")[0].service_name
We already have the reference fields city
and availability
set up and we added the default value "General" for the city
reference field.
Checking if the default value is needed?
Now, as a default value is set before we run the query to find the personalized content, first we check whether we need to use the default value at all, by running the following query:
Queryselect distinct city from upsell_services where city = "Los_Angeles" limit 1
Result: Empty
If the default value query’s result is empty, Emarsys needs to use the default value instead of the original value.
Queryselect distinct service_name from upsell_services where (city = "General" AND availability = "yes") limit x
Result: Not Empty
If the default value query’s result is not empty, Emarsys runs the query with the original value (not with the default one).
Queryselect distinct service_name from upsell_services where (city = "Los_Angeles" AND availability = "yes") limit x
Personalizing multiple contacts
Generally, Emarsys can batch together and run queries per 500 contacts per personalization source (table or view). However, if the message contains the same source multiple times with different configurations (parameters), then the batch size is multiplied by the number of different configurations in the message and they are grouped together. Each query will contain up to 1,000 parameter groups in the where
condition.
A parameter group contains multiple reference fields for a personalization token.
In this example, the contact.1234
field contains the destination cities for each contact:
rds.myconnection.upsell_services(contact.1234,"yes")[0].service_name
rds.myconnection.upsell_services(contact.1234,"no")[0].service_name
This would be 2x500 parameter group, because even though it is from the same source (view upsell_services) it has 2 different configurations. So, for the 500 contacts Emarsys runs 1 query containing 1000 parameters in the where
condition.
Query for default handlingselect distinct city from upsell_services where (city = "Vienna") OR (city = "Los_Angeles") limit 2
If the default query result is not emptyselect distinct service_name from upsell_services where ((city = "Los_Angeles" AND availability = "yes") OR (city = "Vienna" AND availability = "yes")) limit x
Personalizing multiple parameters
If the message contains the same source multiple times with different configurations or parameters, they get merged together into one query.
ESL
rds.myconnection.upsell_services("Los_Angeles","yes")[0].service_name
rds.myconnection.upsell_services("Los_Angeles","yes")[0].service_type
Query
select distinct service_name, service_type from upsell_services where (city = "Los_Angeles" AND availability = "yes") limit x
If multiple sources are used in the same campaign, they are not merged together to create a single query. Each individual source will have its own individual query.
Limit in personalization queries
We use limit
in every query. When you use the same ESL with different indexes, the limit is increased instead of running the query multiple times.
ESL
rds.myconnection.upsell_services("Los_Angeles","yes")[0].service_name
rds.myconnection.upsell_services("Los_Angeles","yes")[1].service_name
rds.myconnection.upsell_services("Los_Angeles","yes")[2].service_name
Query
select distinct service_name from upsell_services where (city = "Los_Angeles" AND availability = "yes") limit 3
In case of the foreach ESL tag, you can define the limit yourself, however the maximum limit is 100
. If undefined, the limit is 100
by default.
Retrying queries
If multiple ESL or personalization tokens are present in your campaigns with multiple personalization sources (table or view) and at least one times out, all queries get retried not just the faulty slow one.
For example, let's say you have the following in the campaign:
rds.myconnection.upsell_services("Los_Angeles","yes")[0].service_name
rds.myconnection.bookings(contact.1234,"Los_Angeles")[0].date_of_arrival
If one bookings table is slow and fails to return the results within due time, both personalization queries will be retried. As a result, following queries run multiple times in your database :
select distinct service_name from upsell_services where (city = "Los_Angeles" AND availability = "yes") limit x
select distinct date_of_arrival from bookings where (user_id = 2 AND destination_city = "Los_Angeles") limit x
Of course, in case the campaign is sent out to multiple contacts, the conditions can be more complex than the ones described above.
The duration of creating Relational Data personalization queries - FAQ
What is the time limit for Relational Data personalization queries?
For optimal performance, the queries should finish running less than 500 ms.
What can I do when a Relational Data personalization query is too slow?
Optimizing the duration of Relational Data personalization queries is very similar to optimizing relational segment template queries. Creating such well-optimized queries requires deep understanding of the type of database you use, the data you store in your database, the use-case you are trying to create as well as general understanding of query optimization.
We strongly recommend consulting an expert if you experience performance issues.
- When creating a personalization query, always consider the following:
- How large will the result set be?
- How large is the dataset the query scans to determine the results?
- After you considered these, you can decide:
- Whether to use indexes: Depending on the type of database you use, this might be different, but generally you should consider using indexes. Most of the time with larger datasets, indexes can help performance. Also, you should index reference fields.
- Whether to use functions in the where clause: Some functions can increase query time significantly because they can result in unintended table scans. Before using them we recommend to research their performance impact.
- What you really need to include in the query: Avoid to use select *. Instead, include the specific columns that you need individually.
- Whether to use join instead of correlated subqueries: Most of the time when operating with larger datasets, joins have better performance than correlated subqueries.
- How to use joins: Joins can be time-consuming as well. Do not join unused tables, and always try to join on indexed fields. Also, fields on you are joining should have the same data type.
- Whether to use wildcards: Using wildcards will definitely slow down your query, especially for huge tables. Try to avoid it if you can.