Data Management

You will be challenged to:

  • Configure import activity using various file formats within Marketing Cloud
  • Apply SQL to produce the desired results
  • Explain the different ways to extract data from Marketing Cloud
  • Describe SQL best practices for managing data in Marketing Cloud
  • Apply best practices for send logs
  • Describe how data is affected by the Contact delete process.

Data Views

Get to know these system views by heart:

  • _listsubscribers
  • _subscribers
    • Parent Business Unit only
    • To determining the Business Unit level subscriber status is by running a Tracking Extract from the Business Unit and include 'All Subscribers' in your extract. The resulting subscribers.txt file will contain that status relative to that business unit.
  • _sent
  • _open
  • _click
  • _bounce
  • _unsubscribe
  • _complaint
  • _job
  • _journeyactivity
  • _journey

Data Extract


Import Activity

  • Types
  • Can be used to
    • Add, Add/Update, Update, Overwrite
  • If the file you are importing is encrypted or compressed, you can use a file transfer activity to prepare the file for import.
  • Marketing Cloud Connect users can create an import activity to populate a data extension with object and report data from other clouds via Synchronized Data Sources.
  • You can specify the character encoding your imports use by contacting your Salesforce admin. The encoding you enable applies to all imports in an account.
  • Marketing Cloud assumes that the FTP server in the selected file transfer location reports time in UTC. UTC assumes no time zone offset. If the remote FTP server is not utilizing UTC, adjust limits specified on the Properties page, including System buffer, during activity setup to ensure the activity runs on the intended schedule.

File Transfer Activity

Encrypt files with SFMC Public Key

Use this activity for a file found on the FTP to:

  • Unzip Files - drops file in the same folder
  • Decrypt Files - Requires Private Key
    • PGP / GPG Decryption
  • Move file from FTP to safehouse
    • Safehouse is not an accesible place
    • It is a data store at the platform level
  • Files will remain on the sFTP for at least 21 days

Subscription Management


Sendlogs

  • Gives you access to data not available via standard tracking functions in the Salesforce Marketing Cloud application. You can also specify how long that information will be available depending on your data retention needs
  • You can run query activities on the send logging data extension to return information on the information contained in that data extension.

Best Practices

  • Important to assign a data retention policy, 10 days recommended
  • Keep the number of custom fields low, 10 or fewer
  • When creating a field type Text
    • Do not leave its length blank
    • Avoid assigning a huge length (1000+)

Synchronized Data Extensions

  • Criteria on data import

Contact Delete Feature

  • Full understanding of delete process
  • Best Practices
    • To improve Contact Delete performance, break your requests into at least 6 smaller requests and submit each request 5 minutes apart.
    • To improve the speed of this process, delete any unnecessary sendable data extensions in your account.
    • You can delete up to one million records.
  • Order of Operation
    • Deletion process initiated > Suppression period > Final deletion
  • Only contacts in these studios can be delieted
    • Email Studio
    • MobileConnect
    • MobilePush
    • Contact Builder
  • Feature needs to be enabled in the parent account level

Deletion Process

  1. Process is initiated
    • Enterprise 2.0 Accounts - Deletion should be initiated from the parent account level
    • When the process is initiated, suppressed contacts will
      • Disappear from Journey Builder - It prevents entry by any contact with the same ContactKey
      • Disappear from Contact builder - It prevents creation of any new contact record with the same ContactKey
      • Disappear from Audience Builder - Deleted ContatKeys are excluded from audiences
      • Not receive sends or appear in Email Studio - It prevents addition of any contact with the same ContactKey via imports or triggered sends
      • Not received sends or appear in MobileConnect - During this phase, triggered sends and responses to subscriber' messages aren't delivered
      • Not receive sends or appear in MobilePush
      • Not receive sends or appear in GroupConnect - Follow requests are ignored. Block requests are processed. No auto-responses to incoming messages are triggered
      • Not receive sends or appear in Web Recommendations, Email Recommendataions, and Web and Mobile Analytics.
  2. Suppression period
    • 14 days - it can be reduced to 0 (zero)
    • Suppress sends, only - no communications will be sent to that contact
    • The recommended period is set to 0 (zero), however it totally depends on your account design
  3. Deletion occurrs
    • Removed a contact from your Marketing Cloud instance, not only a BU
    • Contacts are deleted from lists and sendable data extensions - Non-sendable DE needs to be part of population, to be included in the deletion process
    • In the backend, the KEYS are preserved and the personal information anonymized, so it doesn't affect data reporting
      • Tracking information of a contact will be deleted
      • Aggregate data is still preserved

  • Sends, Imports, automation, query activites... have priority over deletion process
    • The deletion will occur, over the night UTC system time, when no other process is running
    • Deletion process has the lowest priority in the stack
  • Limitations
    • 1 million records per data extension / list
    • Only 1 key can be used to identify the contact to delete
      • Subscriber Key or Contact Key
      • Email is not an option
  • REST API Call
    • The deletion call process key by key from the payload, asynchronously - the process can last serveral hours to complete
    • Perform it from the parent account, not from the BU
    • You can delete contacts in 3 different types of request
      • 1st, by Contact ID - POST /contacts/v1/contacts/actions/delete?type=ids
      • 2nd, by Contact Key - POST /contacts/v1/contacts/actions/delete?type=keys
      • 3rd, by List - POST /contacts/v1/contacts/actions/deletetype=listReference
    • Request payload, sample
{
    "ContactTypeID":0,
    "values":["ContactKey1","ContactKey2","ContactKey3"],
    "DeleteOperationType":"ContactAndAttributes"
} 

SQL

Best Practices

  • Split long/complex queries in multiple steps
    • To increase performance
    • To avoid timeout
  • Use brackets to mention the name of field or data extensions that have a space in their name
  • Use alias to assign temporary name for tables, to make query easy to read

Order of Execution

SELECT DISTINCT column, AGG_FUNC(column_or_expression), …
FROM mytable
    JOIN another_table
      ON mytable.column = another_table.column
    WHERE constraint_expression
    GROUP BY column
    HAVING constraint_expression
    ORDER BY column ASC/DESC

JOIN Statements

Used to retrieve data from 2 or more data extensions

  • INNER JOIN
    • Return the matched values from 2 or more tables
    • In other words, it looks for records that exists in both tables
  • LEFT JOIN
  • RIGHT JOIN
  • OUTER JOIN

UNION

Combine the results of two or more queries in one result

  • Requirements
    • Fields' data type must be the same
    • Number of fields must be the same across queries
    • Order of fields must be the same across queries
  • UNION
    • Eleminates dupliates
    • Unifies only unique records
  • UNION ALL
    • Unifies all records, even duplicate records

Functions

Operations

  • Not allowed: INSERT, DELETE, UPDATE
  • Allowed: SELECT

Good to Know

  • TOP used without ORDER BY can result in different results, every time it runs
  • TOP with ORDER BY returns the same result set every time it runs
  • Based on MS SQL server 2008 capabilities
  • The server is in Centeral Standard Time (CST)
    • In case your instance is in a different timezone, data conversion functions should be considered
  • ROW_NUMBER() + PARTITION BY + ORDER BY can be used to get the most recent data

Subqueries