Data Lakes

Many FinTechs, financial institutions, and other platforms have built (or are building) bespoke analytics and reporting solutions for proprietary use cases. If your enterprise wants to leverage large volumes of analytical data, the Hurdlr API has you covered via its Data Lakes product.

1. How it works

Hurdlr already allows you to access all of the data that is available for your users via the API. Data Lakes takes this concept several levels further by periodically uploading raw data directly to your own Amazon S3 or Google Cloud Storage bucket.

Then, you can pull that data into the warehouse of your choice (e.g. Snowflake, Stitchdata, etc.), eliminating the need for your data team to build direct connections to the Hurdlr API and allowing them to keep working in the data stack they already know.

The data is structured so that your team can perform whatever analysis is needed to meet your goals.

2. Data types

Hurdlr will upload data covering the following elements of Hurdlr API/SDK information and interactions at daily, weekly, or monthly intervals:

A. Bank linkage

A Parquet file containing information on all linked bank accounts. The columns include the bulk of data available in the /banks/accounts endpoint, including userId, apiInstitutionId, apiAccountNo, apiAccountType, createdDate, lastExpenseSyncedDate, lastRevenueSyncedDate, apiCurrentBalance, and more. Each file contains all the records that were updated since the last data upload.

B. Invoices

A Parquet file containing information on all invoices. The columns include the bulk of data available in the /invoices endpoint, including userId, date, totalAmount, dueDate, sentDate, lastViewedDate, status, and more. Each file contains all the records that were updated since the last data upload.

C. Lifecycle Events

A CSV file containing information on actions each user has taken, which is often useful for optimizing user experiences/funnels. The columns include userId, date, event, and event-specific metadata. The full list of events being tracked is available in the Lifecycle Event documentation.
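Because the Lifecycle Events drop is plain CSV, it can be inspected with nothing but the standard library. The userId, date, and event columns come from the description above; the sample event names and the metadata column contents are illustrative assumptions (see the Lifecycle Event documentation for the real list).

```python
# Sketch: parsing a Lifecycle Events CSV drop with the standard library.
# The userId/date/event columns match the docs; the sample event names and
# metadata values are hypothetical, for illustration only.
import csv
import io
from collections import Counter

sample_csv = """userId,date,event,metadata
101,2024-01-15,BANK_LINKED,apiInstitutionId=ins_1
101,2024-01-15,INVOICE_SENT,invoiceId=9001
102,2024-01-16,BANK_LINKED,apiInstitutionId=ins_2
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Count events by type -- a typical first step when analyzing funnels.
event_counts = Counter(row["event"] for row in rows)
print(event_counts)
```

The same per-event counts, grouped by date, are often the starting point for funnel and retention analysis.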

D. Transactions

A Parquet file containing information on all transactions. The columns include the bulk of data available in the /transactions endpoint, including userId, apiInstitutionId, apiAccountNo, date, amount, bankDescription, categoryName, and more. Each file contains all the records that were updated since the last data upload.
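Since each drop contains only the records updated since the last upload, loading them is an incremental upsert rather than a full replace: merge each file into your staging table on a stable key, letting the newest version of a record win. The sketch below shows that pattern with plain dictionaries; the "transactionId" key is a hypothetical name chosen for illustration, not a documented column.

```python
# Sketch of the incremental-load pattern implied by "each file contains all
# the records that were updated since the last data upload": upsert each
# drop into a staging table keyed on a stable identifier. The key name
# "transactionId" is a hypothetical placeholder, not a documented column.

def upsert_drop(staging, drop, key="transactionId"):
    """Merge one drop into the staging table; the newest record wins."""
    for record in drop:
        staging[record[key]] = record
    return staging

staging = {}
monday = [{"transactionId": "t1", "amount": 10.0, "categoryName": "Meals"}]
tuesday = [
    # t1 reappears because it was re-categorized since the last upload.
    {"transactionId": "t1", "amount": 10.0, "categoryName": "Travel"},
    {"transactionId": "t2", "amount": 25.0, "categoryName": "Supplies"},
]
upsert_drop(staging, monday)
upsert_drop(staging, tuesday)
print(len(staging), staging["t1"]["categoryName"])
```

In a real warehouse this is the same logic a MERGE (upsert) statement performs against the staging table.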

3. Getting started

To set up your own cloud storage bucket to receive Hurdlr API data, follow the instructions for whichever of our Data Lake options you use:

A. Amazon S3 Data Lake
B. Google Cloud Storage Data Lake