
Bulk Data Access

Enterprises can now access nationwide lending and borrowing activity in bulk to supplement their existing data operations.

Welcome to Springstreet.io's enterprise data solutions. We provide unparalleled access to our proprietary Uniform Commercial Code (UCC) lien dataset, offering deep insights into commercial lending, credit risk, and market trends. This document outlines the nature of our data, our robust data infrastructure, and how your enterprise can leverage this information for strategic advantage.

Our Data Infrastructure

Springstreet.io leverages a modern, cloud-native data architecture built on Amazon Web Services (AWS) to ensure our data is managed securely, scales efficiently, and is readily accessible to our enterprise customers.

  • Amazon Redshift Serverless: At the core of our analytical environment is Amazon Redshift Serverless. This powerful data warehouse allows us to process and manage vast quantities of UCC lien data efficiently. The serverless architecture automatically scales compute resources based on demand, ensuring optimal performance and cost-effectiveness without the need for traditional cluster management.

  • Amazon S3 (Simple Storage Service): Our primary data lake and a secure repository for exported data files. S3 provides durable, highly available, and scalable storage for the Parquet files delivered to our enterprise customers.

  • Apache Parquet File Format: Data is exported and delivered in the Apache Parquet format, an optimized columnar storage file format ideal for analytics.

Data Format

We provide our UCC lending dataset in the Apache Parquet format. Parquet is a highly efficient, open-source, columnar storage format that offers significant advantages for analytical workloads:

  • Columnar Storage: Stores data by column rather than by row. This leads to:

    • Improved Query Performance: Analytical queries often only need to access a subset of columns. Parquet allows query engines to read only the necessary columns, drastically reducing I/O.

    • Better Compression: Data within a column tends to be more homogenous, leading to higher compression ratios and reduced storage footprint.

  • Schema Evolution: Supports schema evolution, allowing for additions or modifications to the data structure over time without breaking compatibility.

  • Wide Compatibility: Supported by virtually all major data processing frameworks and query engines in the big data ecosystem (e.g., Apache Spark, Amazon Athena, Redshift Spectrum, Presto, Hive).

Your enterprise data feed will consist of Parquet files, typically compressed using Snappy for a good balance of compression ratio and decompression speed. Each file contains a portion of the UCC lien dataset, with named columns corresponding to the fields described in the Data Dictionary.
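As a rough illustration of the column pruning and compression points above, the sketch below writes a tiny Parquet file and reads back only selected columns using the third-party pyarrow library. The column names (debtor_name, secured_party_name, filing_date), the sample rows, and the helper names are hypothetical placeholders, not the actual feed schema; refer to the Data Dictionary for the real fields.

```python
def write_sample(path):
    """Write a tiny Snappy-compressed Parquet file with made-up rows (illustration only)."""
    import pyarrow as pa  # third-party dependency: pip install pyarrow
    import pyarrow.parquet as pq

    table = pa.table(
        {
            # Hypothetical column names and values, not the real feed schema.
            "debtor_name": ["Acme LLC", "Birch Co"],
            "secured_party_name": ["First Bank", "Second Bank"],
            "filing_date": ["2024-01-05", "2024-02-10"],
        }
    )
    pq.write_table(table, path, compression="snappy")


def read_columns(path, columns):
    """Read only the requested columns; other columns are not loaded into memory."""
    import pyarrow.parquet as pq

    return pq.read_table(path, columns=columns)


def compression_codecs(path):
    """Return the set of compression codecs recorded in the file's metadata."""
    import pyarrow.parquet as pq

    meta = pq.ParquetFile(path).metadata
    return {
        meta.row_group(rg).column(col).compression.upper()
        for rg in range(meta.num_row_groups)
        for col in range(meta.num_columns)
    }
```

Calling read_columns(path, ["debtor_name", "filing_date"]) on a delivered file would return a pyarrow Table with just those two columns, and compression_codecs(path) offers a quick way to confirm which codec a given file actually uses.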

Data Export Process

Our internal data pipeline ensures that the UCC lien data is processed, validated, and prepared for enterprise consumption. The export process from our Redshift Serverless environment to Amazon S3 is designed for efficiency and data integrity:

  1. Data Aggregation & Preparation: Relevant UCC lien data is selected and prepared within our Redshift Serverless data warehouse.

  2. Optimized Unload: After our quality assurance process, we export data directly to a designated Amazon S3 location. This process is configured to create Parquet files.

  3. File Organization: The exported Parquet files are organized in Amazon S3 so that your systems can easily locate and process new or updated data. We provide a manifest file that lists all the Parquet files associated with a specific export, simplifying data ingestion on your end.
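The manifest-driven ingestion described in step 3 could look roughly like the sketch below. It assumes the manifest is a JSON document with an "entries" list of objects carrying an s3:// "url" field, which is the layout Amazon Redshift's UNLOAD command writes when the MANIFEST option is used; the actual manifest layout for your feed should be confirmed with Springstreet. The bucket name, prefix, and file names in the example are hypothetical.

```python
import json
from urllib.parse import urlparse

# Hypothetical manifest content; bucket, prefix, and file names are placeholders.
EXAMPLE_MANIFEST = """
{
  "entries": [
    {"url": "s3://example-feed-bucket/exports/2024-06-01/0000_part_00.parquet"},
    {"url": "s3://example-feed-bucket/exports/2024-06-01/0001_part_00.parquet"}
  ]
}
"""


def parse_manifest(manifest_text):
    """Return (bucket, key) pairs for every file listed in the manifest."""
    manifest = json.loads(manifest_text)
    files = []
    for entry in manifest.get("entries", []):
        parsed = urlparse(entry["url"])  # s3://<bucket>/<key>
        if parsed.scheme != "s3":
            raise ValueError(f"unexpected URL scheme in manifest: {entry['url']}")
        files.append((parsed.netloc, parsed.path.lstrip("/")))
    return files


def new_files(manifest_files, already_loaded):
    """Keep only files that have not been ingested yet, preserving manifest order."""
    return [f for f in manifest_files if f not in already_loaded]
```

Tracking the set of already loaded (bucket, key) pairs on your side is one simple way to make repeated ingestion runs idempotent.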

Accessing your Data Feed

Springstreet.io offers flexible and secure methods for our enterprise customers to access their subscribed UCC lending data feed. The primary method involves direct access to Parquet files in a dedicated Amazon S3 bucket.

Primary Access Method: Dedicated Amazon S3 Bucket

  1. Secure S3 Bucket: Upon agreement, Springstreet.io will provision a secure Amazon S3 bucket (or a specific prefix within a shared bucket) where your enterprise data feed (Parquet files) will be delivered.

  2. IAM-Based Access: Access to this S3 bucket is granted via AWS Identity and Access Management (IAM). We will work with your technical team to set up cross-account IAM roles or specific IAM user credentials with least-privilege permissions (typically read-only access to the designated S3 location).

  3. Data Ingestion: Your enterprise can then use standard AWS SDKs, AWS CLI, or compatible ETL tools to list and download the Parquet files from the S3 bucket into your own data warehouse, data lake, or analytical environment.
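The three steps above might be scripted along the following lines with boto3, the AWS SDK for Python. This is a minimal sketch that assumes access is granted through a cross-account IAM role; the role ARN, session name, bucket, prefix, and helper names are placeholders to be replaced with the values agreed with Springstreet.

```python
import os


def assume_role_client(role_arn, session_name="springstreet-feed"):
    """Create an S3 client from temporary credentials for a cross-account role."""
    import boto3  # third-party dependency: pip install boto3

    creds = boto3.client("sts").assume_role(
        RoleArn=role_arn, RoleSessionName=session_name
    )["Credentials"]
    return boto3.client(
        "s3",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )


def list_parquet_keys(s3_client, bucket, prefix):
    """List every .parquet object key under the prefix, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".parquet"):
                keys.append(obj["Key"])
    return keys


def download_all(s3_client, bucket, keys, dest_dir):
    """Download each key into dest_dir, named after the last path segment of the key."""
    os.makedirs(dest_dir, exist_ok=True)
    paths = []
    for key in keys:
        local_path = os.path.join(dest_dir, os.path.basename(key))
        s3_client.download_file(bucket, key, local_path)
        paths.append(local_path)
    return paths
```

A typical run would call assume_role_client with the agreed role ARN, pass the resulting client to list_parquet_keys, and hand the keys to download_all. Because download_all keeps only the file name, keys from different prefixes that share a name would overwrite each other, so downloading one export at a time is the safer pattern here.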

Security:

  • Encryption at Rest: All data in S3 is encrypted using server-side encryption (SSE-S3 or SSE-KMS).

  • Encryption in Transit: Data transfer to and from S3 is secured using HTTPS/TLS.
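If you want to confirm the encryption mode of a delivered object yourself, one possible check is to inspect the ServerSideEncryption field that S3 returns from a HeadObject call. The sketch below assumes a boto3 S3 client (which uses HTTPS endpoints by default); the helper names and the "unknown" label are illustrative choices.

```python
def encryption_at_rest(head_response):
    """Map the ServerSideEncryption field of a HeadObject response to a label."""
    algorithm = head_response.get("ServerSideEncryption")
    if algorithm == "AES256":
        return "SSE-S3"
    if algorithm == "aws:kms":
        return "SSE-KMS"
    return "unknown"  # field missing or a value this sketch does not handle


def check_object(s3_client, bucket, key):
    """Fetch object metadata and report how the object is encrypted at rest."""
    return encryption_at_rest(s3_client.head_object(Bucket=bucket, Key=key))
```

Note that when an object is encrypted with SSE-KMS, the reading role generally also needs permission to decrypt with the corresponding KMS key, so it is worth raising this during IAM setup.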

Additional Support

To learn more about Springstreet.io's enterprise UCC lending dataset and discuss your specific data requirements, please contact our team:

Email: hello@springstreet.io

We can provide sample data and detailed schema information, and we can work with your technical teams to establish a smooth and secure data delivery process.