Behind the Scenes: Automating Heatmap Generation for Large Geochemical Datasets

January 20, 2025

Hey there, tech enthusiasts and geo-wizards! Ready to take a quick peek under the hood of the MinersAI geological heatmap generator? Today, we're pulling back the curtain to show you exactly how MinersAI transforms raw geochemical data into the heatmaps you see on your screen, all while handling multiple requests without breaking a sweat.

The Journey of Your Heatmap Request

Let’s walk through what happens when you hit that “Generate Heatmap” button. Trust me, it’s more fascinating than you might think!

Step Functions conceptual workflow diagram detailing our serverless heatmap generation pipeline

Efficient Data Retrieval and Processing

When you initiate a heatmap generation request, our FastAPI backend immediately validates your parameters and triggers a Step Functions workflow. The first crucial step is retrieving your geochemical data from OpenSearch. We chose OpenSearch specifically for its ability to handle complex spatial queries efficiently. Our implementation allows us to process millions of data points while applying sophisticated filtering based on your requirements.
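
To make the retrieval step a little more concrete, here is a minimal sketch of what a spatially filtered query body could look like. It uses OpenSearch's geo_bounding_box query inside a bool filter. The field names ("location", "element"), the function name, and the default page size are assumptions made for illustration, not the actual MinersAI schema.

```python
import json


def build_geochem_query(element, top_left, bottom_right, size=1000):
    """Build an OpenSearch query body for samples of one element inside a box.

    top_left and bottom_right are (lat, lon) tuples. The field names
    "element" and "location" are hypothetical; "location" is assumed to be
    mapped as a geo_point in the index.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                # Filters do not affect scoring, which suits yes/no constraints.
                "filter": [
                    {"term": {"element": element}},
                    {
                        "geo_bounding_box": {
                            "location": {
                                "top_left": {
                                    "lat": top_left[0],
                                    "lon": top_left[1],
                                },
                                "bottom_right": {
                                    "lat": bottom_right[0],
                                    "lon": bottom_right[1],
                                },
                            }
                        }
                    },
                ]
            }
        },
    }


query = build_geochem_query(
    "Cu", top_left=(-20.0, 118.0), bottom_right=(-24.0, 122.0)
)
print(json.dumps(query, indent=2))
```

The resulting dictionary is what would be passed as the request body to a search call; pagination for very large result sets is left out of this sketch.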

The retrieved data undergoes initial processing where we clean, normalize, and prepare it for interpolation. This step is crucial for ensuring accurate results, especially when dealing with geochemical data that often contains outliers or requires specific transformations.

The Core: Spatial Interpolation

The heart of our heatmap generation lies in the Kriging interpolation process. Our implementation carefully considers the spatial relationships between data points to create accurate predictions across your area of interest. The process involves calculating spatial autocorrelation through variogram analysis and using these relationships to predict values at unsampled locations.
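
For readers who want to see what "variogram analysis" means in practice, the sketch below computes an empirical semivariogram: for each distance bin, half the average squared difference between pairs of samples that far apart. It is a plain-Python illustration of the concept, not our production code; the function name, the binning scheme, and the sample values are assumptions.

```python
import math
from itertools import combinations


def empirical_semivariogram(points, values, bin_width, n_bins):
    """Return one semivariance per distance bin (None for empty bins).

    gamma(h) = sum((z_i - z_j)^2) / (2 * N(h)) over the N(h) pairs whose
    separation distance falls into bin h.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for (i, p), (j, q) in combinations(enumerate(points), 2):
        distance = math.dist(p, q)
        bin_index = int(distance // bin_width)
        if bin_index < n_bins:
            sums[bin_index] += (values[i] - values[j]) ** 2
            counts[bin_index] += 1
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]


# Three made-up samples along a line, one unit apart.
sample_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
sample_values = [1.0, 2.0, 4.0]
gamma = empirical_semivariogram(sample_points, sample_values, 1.0, 3)
print(gamma)
```

A variogram model (spherical, exponential, and so on) is then fitted to values like these, and that fitted curve is what drives the Kriging weights.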

For the Kriging itself, we rely on PyKrige’s Ordinary Kriging implementation. What makes our system flexible is the level of control we give to our users. Before interpolation begins, users can handle values below their data’s Lower Limit of Detection (LOD) in three ways:

  • Exclude values below the LOD
  • Set them to 1/3 of the detection limit
  • Set them to 1/2 of the detection limit
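
The three options above can be sketched in a few lines. This is an illustrative version only: the function name, the strategy labels ("exclude", "third", "half"), and the single shared detection limit are assumptions, not the actual MinersAI API.

```python
def apply_lod(values, lod, strategy):
    """Handle values below the detection limit `lod`.

    strategy (hypothetical labels):
      "exclude" - drop values below the LOD
      "third"   - replace them with lod / 3
      "half"    - replace them with lod / 2
    """
    if strategy == "exclude":
        return [v for v in values if v >= lod]
    if strategy == "third":
        substitute = lod / 3
    elif strategy == "half":
        substitute = lod / 2
    else:
        raise ValueError(f"unknown LOD strategy: {strategy!r}")
    return [v if v >= lod else substitute for v in values]


raw = [0.5, 2.0, 0.1]
print(apply_lod(raw, lod=1.0, strategy="exclude"))
print(apply_lod(raw, lod=1.0, strategy="half"))
```

Substitution keeps the sample locations in the dataset, which matters for interpolation coverage, while exclusion avoids introducing artificial values.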

We also handle outliers through Z-score calculations, where users can set their own threshold for what constitutes an outlier in their specific geological context.
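
As a rough illustration of Z-score filtering, the sketch below drops any value whose distance from the mean, measured in standard deviations, exceeds a user-chosen threshold. The function name, the default threshold, and the use of the population standard deviation are assumptions for this example.

```python
import statistics


def remove_outliers(values, threshold=3.0):
    """Keep values whose absolute Z-score is at most `threshold`."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        # All values are identical, so nothing can be an outlier.
        return list(values)
    return [v for v in values if abs(v - mean) / std <= threshold]


assays = [10.0] * 9 + [100.0]  # made-up values with one obvious spike
filtered = remove_outliers(assays, threshold=2.5)
print(len(assays), "->", len(filtered))
```

Geochemical data is often strongly skewed, so in practice a Z-score may be more meaningful after a log transform; that choice is left to the user here.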

Users aren’t locked into a single interpolation method either: they can choose between Kriging and Inverse Distance Weighting (IDW) based on their needs. For those opting for Kriging, we provide additional customization options, including the choice of variogram model and the colormap for the final visualization. This level of customization ensures that geologists can adapt the interpolation process to their specific geological context and visualization preferences.
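
For comparison with Kriging, IDW is simple enough to show in full. The sketch below predicts a value at one target location as a weighted average of the samples, with weights proportional to 1 / distance^power. It is a plain-Python illustration under assumed names and a default power of 2, not the implementation used in our pipeline.

```python
import math


def idw(points, values, target, power=2.0):
    """Inverse Distance Weighting estimate at `target`.

    points: list of (x, y) sample locations
    values: measured value at each sample location
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for point, value in zip(points, values):
        distance = math.dist(point, target)
        if distance == 0.0:
            # The target sits exactly on a sample, so return it unchanged.
            return float(value)
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


samples = [(0.0, 0.0), (2.0, 0.0)]
grades = [10.0, 20.0]
print(idw(samples, grades, (1.0, 0.0)))  # midpoint between the two samples
```

Unlike Kriging, IDW needs no variogram model, which makes it quick to run but means it cannot account for the spatial structure of the data or report an estimation variance.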

Creating the Raster

The interpolation results are then transformed into a visual representation. Our system generates a GeoTIFF file, a format chosen for its ability to maintain geographical reference information while providing efficient storage and transfer capabilities. We apply carefully calibrated color scales that highlight important variations in your geochemical data, making patterns and anomalies easily identifiable.
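
The "geographical reference information" in a GeoTIFF boils down to a small affine transform that maps pixel positions to map coordinates. The sketch below builds one in the GDAL-style six-coefficient order for a north-up grid and uses it to locate a pixel corner. The function names and the example bounds are assumptions; writing the actual file would be handled by a raster library and is not shown.

```python
def make_geotransform(west, south, east, north, width, height):
    """GDAL-style geotransform for a north-up raster.

    Order: (x origin, pixel width, row rotation,
            y origin, column rotation, negative pixel height).
    """
    pixel_width = (east - west) / width
    pixel_height = (north - south) / height
    return (west, pixel_width, 0.0, north, 0.0, -pixel_height)


def pixel_to_coords(gt, col, row):
    """Map the top-left corner of pixel (col, row) to map coordinates."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


# Made-up bounds: a 10 x 10 degree area on a 10 x 10 pixel grid.
gt = make_geotransform(100.0, -30.0, 110.0, -20.0, width=10, height=10)
print(pixel_to_coords(gt, 0, 0))    # top-left corner of the raster
print(pixel_to_coords(gt, 10, 10))  # bottom-right corner of the raster
```

The pixel height is negative because row numbers grow downward while northing grows upward.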

The resulting files undergo optimization to ensure quick loading times without sacrificing quality. We utilize advanced compression techniques that maintain the integrity of your data while reducing file sizes significantly. These optimized files are then stored in S3, providing reliable and quick access when you need to view or download them.

Why Our Architecture Makes a Difference

Our serverless architecture provides several key advantages for heatmap generation. By leveraging AWS Lambda, we can process multiple requests simultaneously without performance degradation. Step Functions ensure each stage of the process is monitored and executed reliably, with automatic error handling and retries when necessary.
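
To give a feel for how retries and error handling are declared, here is a hedged sketch of a Step Functions definition written as a Python dictionary in Amazon States Language. The state names, the retry settings, and the placeholder ARNs are all assumptions for illustration; the real workflow has its own stages and configuration.

```python
import json


def build_state_machine(fetch_arn, interpolate_arn, raster_arn):
    """Return a three-stage state machine definition with retries."""

    def task(resource, next_state=None):
        state = {
            "Type": "Task",
            "Resource": resource,
            # Retry failed tasks with exponential backoff: 2s, 4s, 8s.
            "Retry": [
                {
                    "ErrorEquals": ["States.TaskFailed"],
                    "IntervalSeconds": 2,
                    "MaxAttempts": 3,
                    "BackoffRate": 2.0,
                }
            ],
            # Anything still failing after the retries goes to a Fail state.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "HandleFailure"}],
        }
        if next_state:
            state["Next"] = next_state
        else:
            state["End"] = True
        return state

    return {
        "StartAt": "FetchData",
        "States": {
            "FetchData": task(fetch_arn, "Interpolate"),
            "Interpolate": task(interpolate_arn, "WriteRaster"),
            "WriteRaster": task(raster_arn),
            "HandleFailure": {
                "Type": "Fail",
                "Error": "HeatmapFailed",
                "Cause": "A pipeline stage failed after all retries",
            },
        },
    }


# Placeholder ARNs; REGION and ACCOUNT_ID are intentionally not filled in.
definition = build_state_machine(
    "arn:aws:lambda:REGION:ACCOUNT_ID:function:fetch-data",
    "arn:aws:lambda:REGION:ACCOUNT_ID:function:interpolate",
    "arn:aws:lambda:REGION:ACCOUNT_ID:function:write-raster",
)
print(json.dumps(definition, indent=2))
```

Declaring retries in the workflow definition keeps the individual functions free of retry loops, so each stage only has to do its own job.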

The combination of OpenSearch for data retrieval, containerized processing for consistent results, and S3 for efficient storage creates a robust pipeline that handles the complexities of geochemical data processing effectively. This architecture allows us to maintain high performance even during peak usage periods while ensuring cost-effectiveness during quieter times.

Performance in Production

In real-world usage, our system consistently delivers impressive results. Processing times remain stable even with large datasets, and our automatic scaling ensures that multiple users can generate heatmaps simultaneously without experiencing delays.

The quality of your geochemical heatmaps depends not just on the interpolation algorithm, but on the entire processing pipeline. Our architecture ensures that every step, from data retrieval to final visualization, is optimized for accuracy and efficiency.

A high-resolution geochemical heatmap showcasing mineral distribution patterns, generated using our Kriging interpolation pipeline