You must format your sales data file according to the specifications and guidelines described below.
Before you start
What is sales data?
Sales data is all the information about the transactions registered in your stores, including daily purchases, returns and cancellations, as well as offline data from your physical stores. We store your sales data in our Sales Data Service, through which individual Emarsys apps can access it.
The status of your sales data as well as the regular updates can be monitored on the Management > Predict Data Sources > Sales Data page.
- Sales data uploads are incremental, meaning new uploads add to existing sales data and do not overwrite them.
- As Emarsys currently does not deduplicate sales data, make sure you upload each individual order only once as duplicate records will skew your reporting.
Why is sales data important?
Even though our Web Extend data collection scripts track purchases in your online store, they do not collect any information about returns, cancellations and offline transactions. This information is essential for all the related Emarsys functionality, such as accurate revenue reporting or product affinity models, and helps to make our smart features even smarter.
Your sales data is required by most of the Emarsys products and features, including Predict and Smart Insight. It is also needed for campaign revenue attribution and revenue statistics and reporting.
How much effort do I need to invest in this?
First you will need to generate an initial file, ideally containing at least two years of sales data, that conforms to our specifications, described below. Depending on the size and complexity of your sales history, it may take a few hours for your data specialist to deliver this file.
Then, if our validation engine detects some issues, you will need to fix these before proceeding, which may take another couple of hours of your data specialist’s time.
Based on past experience with a wide sample of customers, here is an indication of how much time you may require to set up your sales data file:
|Generating historical sales data file||Data specialist||Up to 8 hours|
|Validating the file||Data specialist||5 minutes|
|Fixing errors, if applicable||Data specialist||Depends on the number and severity of issues, but usually no more than 2 hours|
|Setting up daily sales data import||Data specialist||Up to 2 hours|
How often do I need to upload sales data?
After making the initial upload of your historical sales data, Emarsys recommends to upload new sales data on a daily basis.
You may upload sales data updates more frequently, but Smart Insight will process these sales data files in an aggregated manner only once a day.
General rules for your data file
Your sales data file upload will fail if it does not pass all our validation checks. You must make sure that the file is correctly formatted as described below.
The columns in the header of your sales data file must be in the order that was agreed during the Smart insight onboarding implementation. If they are not, the upload will fail.
You can check the order of the columns at the top of the Sales Data page (Management menu > Predict Data Sources > Sales Data > Manage sales data), for example:
- Your sales data file should be a simple .csv table, with each row describing the attributes of a single line item.
- Your file must start with a header row (e.g. the field names).
- The file name must follow the convention
sales_items*.csv. In other words, the file name must begin with the string
sales_itemand its extension must be .csv, while you can replace the asterisk with a number of characters of your choice (e.g. a timestamp).
- Use a CSV export library that is compatible with the standard CSV format.
- Use a comma to separate fields in a row.
- When a field does not have any value, leave it empty. Example:
- Do not use newline characters, either unquoted or quoted, in the fields.
- Use UTF-8 encoding.
- Include the field names in the first row (header) and make sure that all the subsequent rows contain the values corresponding to the fields listed in the header. If you are not sure how to do this, contact Emarsys Support.
- Do not use more than 255 columns.
- Do not use field values longer than 64 KB.
- Make sure the required fields are entered case-sensitively, or otherwise those will not be recognized by the system. 'Price' is not the same as 'price'. 'Price' is recognized as a custom field and will get a prefix, such as f_, which will look like f_Price. In this case the price field will be missing.
Adding columns to a working sales data file
If you have a working implementation of Smart Insight, you can still add columns to your sales data CSV file without a full reload.
There are two crucial things to pay attention to when adding the new columns:
- add the new columns at the end of the configured header (column list)
- keep the existing order of the columns
If you change the order of the existing columns or insert the new columns between the existing ones, some of the existing historical sales data will be overwritten. We cannot recover such data, so if this happens, only a full Smart Insight reload is able to restore the lost data.
For in-depth information on CSV file editing, see Modifying CSV files in use.
Standard field set
These fields are required for the standard feature set in Emarsys.
Example: Your Emarsys account is set to Eastern Time. The time of the purchase is 09:02 AM Central Time, Dec 16th. We want the timestamp value in the sales feed to be 2021-12-16T10:02:00-05:00. Since the switch to daylight saving time does not affect UTC, you need to ensure the offset reflects that change, as well. With the given timestamp value and if the next daylight savings event is moving time forward, then the timestamp would look like this: 2021-12-16T10:02:00-04:00.
Warning: If you upload the sales data file via the API, then the timestamp field cannot contain more than 25 characters.
Important: Make sure to use either
If you are using Emarsys Loyalty, you will also need to have these fields in your sales data file.
All Loyalty fields need to be set as filterable in Smart Insights Settings > Purchase fields.
||Please insert here all coupons and voucher codes your user used at checkout. If the user used more than one, you can add multiple codes by separating them with a comma. If the coupon was not used for a specific item, you can send the same coupon code for all items purchased, we will only consider the unique codes per each order. Loyalty uses this field for the referral program to know that the user referred made a purchase with the coupon code received from a loyalty member. For more information, see Referral program
Cancellation and refunds requirements for Loyalty
When submitting returned items or orders, always use the order, customer and item IDs of the original order and provide a negative value in both the price and the quantity fields, as well as in
f_original_price. Please make sure to populate the fields in your sales data file this way, otherwise refunds will not be properly represented in your reports.
I use the Shopify plugin but I cannot add additional columns to my sales data. What shall I do?
If you are using the Shopify plugin, there is no need to add the three additional fields for Loyalty, as the plugin always only works on one Shopify website - market. Loyalty will use the
price field instead of the
I only work in a single market and single currency, but I want the price column to include only the net price. What shall i do?
In this case, provide the additional three fields, and the
f_original_price should include the price for which you want to give out points, namely the gross price.
s_store_id fields are part of a feature that is currently on release for Retail clients only.
If you are using the Strategic Dashboard for Retail customers feature, you will also need to have these fields in your sales data file.
||Yes||String||“online"||The sales channel of the order. Its value is either “online” or “offline”.
||Yes||String||“87"||The external ID of the store.|
Besides the mandatory sales data fields, you can also add other information to your sales data file by using custom fields. The names of custom fields must begin with a prefix specifying the data type. The prefix then must be followed by an underscore (
_) and the name of the custom column. If the name of a field consists of multiple words, then each word must start with a lowercase letter and each word should be separated by an underscore. For example,
Available custom field prefixes:
- Custom fields are validated for data type and if the submitted values are not consistent with the expected data type, the sales data file will be rejected.
- Custom fields can be empty. However, for numeric fields (
price , quantity , i_ , f_) they cannot have empty strings. For example, double quotes with no values in it
""are not accepted. Numeric fields with a NULL value, for example double commas:
Sales data example
The following example shows orders for a shoe store with three custom columns (the
timestamp is specified in UTC) :
item,price,order,timestamp,customer,quantity,s_color,f_size,t_published 103-nike-1,24.95,112343,2016-06-06T14:02:00Z,093275854,1,blue,14.5,2016-09-20T15:00:00Z 105-adidas-66,99.95,112343,2016-06-06,493027334,3,black,14.5,2016-09-20T15:00:00Z 107-nike-24,50.99,66234,2016-06-06T17:00:00Z,672985375,1,pink,13.5,2016-08-01
In the following example, the
timestamp format is specified in local time with time zone offset information (the time zone is UTC+6):
order,timestamp,customer,item,price,s_channel,quantity,f_unit_price,s_storeid,s_currency,s_colour_code,s_colour_name,s_size_code,s_barcode,s_department,s_class,s_season,s_brand,f_total_discount,f_total_ex_tax,s_voucher_number,s_discount_reason MJB_228762E0022735,2020-07-29T08:47:32+06:00,341472,SHI106S17_Sky,20.30,OFFLINE,1,29,ONLINE,AUD,Sky,Sky,37,9342743705097,Shirts,Item,Spring17,XYZ_Brand,8.70,11.09,,Promotional Discount MJB_228762E0022735,2020-07-29T08:47:32+06:00,341472,SHI267S17_Blue,20.30,OFFLINE,1,29,ONLINE,AUD,Blue,Blue,37,9342743724463,Shirts,Item,Summer17,XYZ_Brand,8.70,11.09,,Promotional Discount MJB_228762E0022735,2020-07-29T08:47:32+06:00,341472,SHI229S18_PinkNavy,20.30,OFFLINE,1,29,ONLINE,AUD,Pink/Navy,Pink/Navy,37,9342743788434,Shirts,Item,Summer 18,XYZ_Brand,8.70,12.91,,Promotional Discount MJB_228840E0022736,2020-07-29T08:48:42+06:00,538691,SHI302S18_Navy,34.30,OFFLINE,1,49,ONLINE,AUD,Navy,Navy,48,9342743790086,Shirts,Item,Spring 18,XYZ_Brand,14.70,25.64,,Promotional Discount MJB_228856E0022737,2020-07-29T08:49:41+06:00,527149,SHI289S18_Denim,20.30,OFFLINE,1,29,ONLINE,AUD,Denim,Denim,S,9342743789516,Shirts Casual,Item,Summer 18,XYZ_Brand,8.70,9.28,,Promotional Discount MJB_228825E0022738,2020-07-29T08:50:43+06:00,462561,SHI140W19_SkyWhite,34.30,OFFLINE,1,49,ONLINE,AUD,Sky/White,Sky/White,42,9342743802130,Shirts,Item,Autumn19,XYZ_Brand,14.70,25.64,,Promotional Discount MJB_228835E0022739,2020-07-29T08:52:05+06:00,527806,SHI210S18_Blue,20.30,OFFLINE,1,29,ONLINE,AUD,Blue,Blue,46,9342743784351,Shirts,Item,Summer 18,XYZ_Brand,8.70,11.09,,Promotional Discount MJB_228835E0022739,2020-07-29T08:52:05+06:00,527806,SHI284S18_Sky,20.30,OFFLINE,1,29,ONLINE,AUD,Sky,Sky,46,9342743785556,Shirts,Item,Summer 18,XYZ_Brand,8.70,11.09,,Promotional Discount
Returns, cancellations and refunds
When submitting returned items or orders, always use the order, customer and item IDs of the original order and provide a negative value in the
quantity and the
f_original_price fields. Please make sure to populate the fields in your sales data file this way, otherwise refunds will not be properly represented in your reports.
A customer might buy three items:
|Order ID||Date||Customer ID||Item ID||price||quantity||s_market||f_original_price||s_original_currency|
A week later they decided to return two of the items, which should be reported in the sales data file as follows:
|Order ID||Date||Customer ID||Item ID||price||quantity||s_market||f_original_price||s_original_currency|
These two order parts will then be aggregated into a single order with the total value of 45, while the order date will remain unchanged (i.e. Emarsys keeps the date of the original order).
Smart Insight and sales data FAQ
Why is API the preferred method for uploading sales data?
Uploading your data via the sales data API has major advantages compared to using SFTP for that purpose. Among other things:
- Sales data uploaded through the API does influence product recommendations, while sales data uploaded through Smart Insight SFTP does not. The method itself makes a difference.
- By using SFTP, your file is not validated straight away, which makes troubleshooting slower.
- SFTP accounts need to be set up, while the sales data API is totally self-service.
- With SFTP, you always have to make sure to give a unique name to your sales data file to avoid overwriting previous uploads.
What happens if the customer identifier (email or customer_id) is missing from some or all of the rows in the sales data file?
You can only upload your sales data if the customer identifier is missing from some or all rows in the sales data file via the Smart Insight SFTP method (not via Predict API). In this case, Emarsys will create one pseudo contact for your account, and attribute all orders where the customer identifier is missing to this contact. If you often submit your sales data files with missing customer IDs, Emarsys may create a new pseudo contact for each affected order.
Pseudo contacts are given a unique customer ID in our contact database and appear as generated on the Smart Insight screens.
In general, try to make sure that this does not happen as this can skew your Smart Insight scoring and reporting.
What happens if a customer is not found in the Emarsys contact database?
If you use the Smart Insight SFTP for uploading your sales data, orders submitted with an email or customer ID missing from the Emarsys contact database will still be processed, and the unknown contact will be given a customer ID and will be flagged as generated in the Customer Lifecycle reports. Using the generated filter, this contact can be separated from regular contacts in your contact database.
If such generated contacts are explicitly identified later, they will turn into regular contacts at the next data load.