
Retool Inventory Management System: Integrating DynamoDB (P2)

Nhan Hoang
October 15, 2024
20 min read

Integrating Amazon DynamoDB with Retool enhances your inventory management system by providing a scalable, high-performance NoSQL database solution. This blog explores how to design and implement DynamoDB for your Retool application, focusing on defining access patterns, leveraging schema flexibility, and understanding pricing models to optimize performance and cost.

What and Why DynamoDB?

Amazon DynamoDB is a fully managed, serverless NoSQL database designed to handle massive amounts of data with low latency and high availability. Unlike traditional relational databases that rely on structured schemas and SQL queries, DynamoDB embraces flexibility and speed, making it ideal for real-time applications.

Relational databases were built for an era when storage was expensive, prioritizing minimal data storage and CPU-heavy queries. SQL allows complex data relationships to be built on-the-fly, but as data grows, this process can become slow and resource-intensive.

In contrast, NoSQL databases like DynamoDB are optimized for the modern era, where storage is cheaper and speed is critical. DynamoDB pre-builds answers to defined questions across the dataset, providing fast retrieval through simple API calls without leaning heavily on CPU resources.

How Does DynamoDB Work?

We’re going to explore how DynamoDB works by designing it for RetoolerStock, our inventory management system. We’ll learn key DynamoDB concepts and apply them step-by-step to build the database for RetoolerStock. This way, we’ll learn by doing.

Step 0: Defining Access Patterns

Let’s begin by outlining the general workflow of the application, and from there, define the access patterns we’ll use in DynamoDB.

A warehouse manager can manually update the inventory of products, including product details and stock levels, either directly or through integration with other services, such as an order management system.

If the stock of a product falls below a defined threshold, the system will automatically trigger an order for a specific quantity from the appropriate supplier and generate a shipping ticket.

Once the goods arrive at the warehouse, the manager can confirm the shipment, update the inventory, and record the restocking event in the inventory history.
Defining Access Patterns

Next, we’ll define a list of access patterns to help structure our table properly and ensure optimal query performance.

  1. Get all products with their stock levels.
  2. Get a product with its stock level.
  3. Get all products that are nearly or already out of stock.
  4. Get all suppliers.
  5. Get a supplier.
  6. Get all products of a supplier.
  7. Get the restocking level of a product.
  8. Get the latest unfinished order detail of a product.
  9. Get all orders made to a specific supplier.
  10. Get an order detail.
  11. Get orders made on a specific date or within a date range.

Step 1: How DynamoDB Works

The first step is to define the main table that will store our product inventory. In DynamoDB, we store data in tables, and each table contains items (similar to rows in relational databases). Each item has attributes (like columns).

How DynamoDB Works?

DynamoDB uses a primary key to uniquely identify each item in a table. The primary key can be just a partition key (a single attribute) or a composite key (partition key + sort key).
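As a minimal sketch, the composite key described above could be declared with request parameters like the following. The table name `RetoolerStock` and the attribute names `PK` and `SK` are assumptions for illustration. The dictionary follows the shape of the parameters accepted by a DynamoDB CreateTable request (for example through boto3's `create_table`), but nothing here contacts AWS.

```python
# Hypothetical table definition: "RetoolerStock", "PK" and "SK" are assumed names.
table_definition = {
    "TableName": "RetoolerStock",
    "AttributeDefinitions": [
        {"AttributeName": "PK", "AttributeType": "S"},  # S = string
        {"AttributeName": "SK", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "PK", "KeyType": "HASH"},   # partition key
        {"AttributeName": "SK", "KeyType": "RANGE"},  # sort key
    ],
    "BillingMode": "PAY_PER_REQUEST",  # on-demand capacity mode
}


def key_names(definition):
    """Return (partition_key, sort_key); sort_key is None for a simple key."""
    roles = {entry["KeyType"]: entry["AttributeName"] for entry in definition["KeySchema"]}
    return roles["HASH"], roles.get("RANGE")


print(key_names(table_definition))
```

Only the key attributes are declared up front; every other attribute can vary from item to item.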

Because NoSQL follows a very different philosophy from relational databases, we should avoid designing the database the way we would in SQL: there are no join operators or foreign keys. With DynamoDB, a single table is usually enough, an approach known as Single-Table Design. To model one-to-one, one-to-many, and many-to-many relationships in that one table, we will use two patterns: Item Collections and the Adjacency List.
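To make the idea of an item collection more concrete, here is a small in-memory sketch. The key formats (`SUPPLIER#...`, `PRODUCT#...`), the attribute names, and the sample data are all assumptions for illustration, not the final RetoolerStock design. Items that share a partition key form one item collection, which is how a one-to-many relationship can live in a single table.

```python
from collections import defaultdict

# Hypothetical items: key formats and attribute names are assumptions.
items = [
    {"PK": "SUPPLIER#s1", "SK": "SUPPLIER#s1", "name": "Acme Parts"},
    {"PK": "SUPPLIER#s1", "SK": "PRODUCT#p1", "name": "Bolt", "stock": 120},
    {"PK": "SUPPLIER#s1", "SK": "PRODUCT#p2", "name": "Nut", "stock": 8},
    {"PK": "SUPPLIER#s2", "SK": "SUPPLIER#s2", "name": "Globex"},
    {"PK": "SUPPLIER#s2", "SK": "PRODUCT#p3", "name": "Washer", "stock": 40},
]

# Group items by partition key: each group is one item collection.
item_collections = defaultdict(list)
for item in items:
    item_collections[item["PK"]].append(item)


def products_of(supplier_id):
    """One-to-many lookup: product items stored under a supplier's partition key."""
    collection = item_collections[f"SUPPLIER#{supplier_id}"]
    return [i["name"] for i in collection if i["SK"].startswith("PRODUCT#")]


print(products_of("s1"))  # ['Bolt', 'Nut']
```

The supplier item and its product items sit side by side, so one lookup by partition key returns the "parent" and its "children" without a join.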

Learn more about supported data types in DynamoDB.

Read more about: Horizontal vs Vertical Scaling, SQL vs NoSQL and Normalization vs Denormalization

Step 2: Schema Flexibility

One of DynamoDB’s greatest strengths is its schema flexibility. Unlike relational databases where every row must have the same schema, DynamoDB allows each item to have different attributes. This is helpful if different products need different information.

Schema Flexibility
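As a rough illustration of schema flexibility, the two hypothetical product items below share the same key attributes but otherwise carry different attributes. All names and values are made up for this sketch.

```python
# Hypothetical product items: only the key attributes (PK, SK) are common to both.
products = [
    {"PK": "PRODUCT#p1", "SK": "DETAILS", "name": "Bolt", "thread_size": "M8"},
    {"PK": "PRODUCT#p2", "SK": "DETAILS", "name": "Paint", "color": "red", "volume_l": 5},
]


def attribute_names(item):
    """Return the non-key attribute names of an item, sorted alphabetically."""
    return sorted(k for k in item if k not in ("PK", "SK"))


for product in products:
    print(product["PK"], attribute_names(product))
```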

Step 3: Understand DynamoDB Pricing

To understand how AWS bills for DynamoDB usage, it's important to understand a few key concepts:

Consistency Models in DynamoDB

DynamoDB offers two types of consistency when reading data:

  • Eventual Consistency: This is a faster option that allows data to be read without immediate confirmation that it’s the most up-to-date version. The server doesn’t need to coordinate across multiple nodes, which speeds up the read operation. This is ideal when real-time accuracy isn’t critical.
  • Strong Consistency: This ensures that the most up-to-date version of the data is always returned, requiring the server to confirm the data’s accuracy by coordinating across multiple nodes. However, this increases the read time.

In most cases, where real-time updates aren’t essential, eventual consistency is preferred because it offers faster performance.
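In the DynamoDB API, the choice between the two models is made per read through the `ConsistentRead` parameter, which defaults to eventually consistent. The sketch below only builds parameters in the shape of a GetItem request; the table name and key format are assumptions for illustration, and no AWS call is made.

```python
def get_product_request(product_id, strongly_consistent=False):
    """Build parameters in the shape of a DynamoDB GetItem request.

    The table name and key format are hypothetical.
    """
    return {
        "TableName": "RetoolerStock",
        "Key": {
            "PK": {"S": f"PRODUCT#{product_id}"},
            "SK": {"S": "DETAILS"},
        },
        # False (the default) requests an eventually consistent read.
        "ConsistentRead": strongly_consistent,
    }


print(get_product_request("p1")["ConsistentRead"])        # False
print(get_product_request("p1", True)["ConsistentRead"])  # True
```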

DynamoDB Capacity Modes

DynamoDB offers two capacity modes:

  • On-Demand Mode: DynamoDB automatically scales based on application traffic, making it ideal for unpredictable workloads. It requires less manual intervention but comes at a higher cost—6 to 7 times more expensive than Provisioned Mode for the same throughput.
  • Provisioned Mode: In this mode, you must specify how many Read Capacity Units (RCUs) and Write Capacity Units (WCUs) are needed. You are billed per hour for the provisioned capacity, even if you don't fully use it. Under-provisioning can lead to throttling, where requests beyond the provisioned capacity are rejected and must be retried. A common approach is to start with On-Demand Mode to assess traffic patterns, then switch to Provisioned Mode to optimize costs.

Read Capacity Units (RCUs)

An RCU represents the capacity to perform one strongly consistent read per second for an item up to 4 KB in size. For eventually consistent reads, one RCU can handle two reads per second.

  • Example: If you need to read an 8 KB item, it would require 2 RCUs for a strongly consistent read or 1 RCU for an eventually consistent read.
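The arithmetic above can be sketched as a small helper. The function name is an assumption for illustration; the rule it encodes is the one stated above: round the item size up to the next 4 KB, and halve the result for eventually consistent reads.

```python
import math


def read_capacity_units(item_size_kb, strongly_consistent=True):
    """RCUs consumed by reading one item per second."""
    blocks = math.ceil(item_size_kb / 4)  # reads are metered in 4 KB blocks
    return blocks if strongly_consistent else blocks / 2


print(read_capacity_units(8))         # 2
print(read_capacity_units(8, False))  # 1.0
```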

Write Capacity Units (WCUs)

A WCU represents the capacity to perform one write per second for an item up to 1 KB in size. If the item size exceeds 1 KB, more WCUs are required for the write operation.

  • Example: Writing a 2 KB item would consume 2 WCUs.
Note: Even if we only modify a small attribute of a large 20 KB item, we will still consume a number of Write Capacity Units (WCUs) equivalent to the entire item's size.
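The same kind of helper can sketch write costs. The function names are assumptions for illustration. For updates, the sketch assumes DynamoDB meters the larger of the item's size before and after the change, which is why editing one small attribute of a 20 KB item is still charged for the whole item.

```python
import math


def write_capacity_units(item_size_kb):
    """WCUs consumed by writing one item per second (1 KB blocks)."""
    return max(1, math.ceil(item_size_kb))


def update_capacity_units(size_before_kb, size_after_kb):
    """WCUs for an update: based on the larger of the two item sizes."""
    return write_capacity_units(max(size_before_kb, size_after_kb))


print(write_capacity_units(2))        # 2
print(update_capacity_units(20, 20))  # 20
```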

Impact of RCUs and WCUs on Pricing

The number of RCUs and WCUs you use directly impacts the pricing structure in DynamoDB:

  • In Provisioned Mode, you're billed based on the number of RCUs and WCUs reserved per hour, regardless of whether they’re fully utilized.
  • In On-Demand Mode, billing is based on the actual RCUs and WCUs consumed.

If you choose strong consistency for reads, the cost will be roughly double that of eventually consistent reads.

Based on the concepts discussed, here are some best practices for optimizing DynamoDB usage:

  • Split Large Data Items: Break down large data items into smaller items based on their access frequency. This helps reduce the number of RCUs required for reading, as smaller items are more efficient to retrieve.
  • Design for Eventual Consistency: Whenever possible, design your application to use eventual consistency. This reduces costs and improves performance unless there is a specific need for strong consistency. Only opt for strong consistency when real-time accuracy is absolutely necessary.
  • Cache: Consider using a caching layer, such as Amazon ElastiCache, to reduce the number of read requests to DynamoDB.

Step 4: Scan or Query for results

DynamoDB automatically shards our table into multiple partitions, distributing them across several servers. This enables horizontal scaling of our database. But how does DynamoDB know which partition to write an item to or retrieve the correct item when the table is distributed across many servers? DynamoDB uses the partition key, which is passed through a hash function to determine the appropriate partition.

A hash function is a mathematical function that takes input (in this case, the partition key) and generates a fixed-size output, which DynamoDB uses to assign the item to a partition.
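The routing idea can be illustrated with a toy example. DynamoDB's actual hash function and partition layout are internal to the service, so the MD5 hash, the modulo step, and the fixed partition count below are stand-ins chosen only for this sketch.

```python
import hashlib


def partition_for(partition_key, partition_count):
    """Map a partition key to a partition number (illustrative only)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and fold it into the partition range.
    return int.from_bytes(digest[:8], "big") % partition_count


for key in ["PRODUCT#p1", "PRODUCT#p2", "SUPPLIER#s1"]:
    print(key, "->", partition_for(key, 4))
```

The same key always produces the same hash, so a write and a later read for that key land on the same partition without searching the others.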

By default, DynamoDB lets us retrieve data using only the partition key (and optionally the sort key). This is known as a Query operation.

Query: Since DynamoDB needs the partition key to identify which partition holds the item, you must provide the partition key and, optionally, the sort key to perform a query. A query operation is highly efficient because it directly looks up items using the partition key (and sort key if needed). DynamoDB quickly finds the correct partition and returns only the data you’re asking for, making queries fast and cost-effective.

When you filter the results of a Query or Scan, you are charged for the total size of the data read before any filtering occurs. For example, if you scan a table and then apply a filter to reduce the results, you still pay for the total size scanned, not just the items returned.

However, there are times when we need to retrieve data based on attributes that aren't part of the primary key, such as searching for suppliers by email or finding products from a specific supplier. On the base table alone, that requires a Scan operation.

Scan: A scan checks every item in the table, regardless of the partition key, since it doesn’t know which partition contains the item. This is like a full table search, making scans slow and costly, especially for large tables. Scans should be avoided when performance is critical, unless you’re working with small datasets or specific cases that don’t allow querying.
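The difference between the two operations, and the point that filtering does not reduce what is read, can be sketched with an in-memory stand-in for a table. The data, key formats, and function names are assumptions for illustration; the sketch counts items read rather than real capacity units.

```python
# Hypothetical in-memory "table": partition key -> items in that item collection.
table = {
    "SUPPLIER#s1": [
        {"SK": "PRODUCT#p1", "stock": 120},
        {"SK": "PRODUCT#p2", "stock": 8},
    ],
    "SUPPLIER#s2": [
        {"SK": "PRODUCT#p3", "stock": 40},
        {"SK": "PRODUCT#p4", "stock": 3},
    ],
}


def is_low_stock(item):
    return item["stock"] < 10


def query(partition_key, item_filter=None):
    """Read a single item collection. Returns (matches, number_of_items_read)."""
    items_read = table.get(partition_key, [])
    matches = [i for i in items_read if item_filter is None or item_filter(i)]
    return matches, len(items_read)


def scan(item_filter=None):
    """Read every item in the table. Returns (matches, number_of_items_read)."""
    items_read = [i for collection in table.values() for i in collection]
    matches = [i for i in items_read if item_filter is None or item_filter(i)]
    return matches, len(items_read)


_, read_by_query = query("SUPPLIER#s1", is_low_stock)
_, read_by_scan = scan(is_low_stock)
print(read_by_query, read_by_scan)  # 2 4
```

In both cases the filter runs after the items are read, so the number of items read (and therefore the cost) is the same with or without it.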

If you find that a scan suits your use case but performance is a deal-breaker, a Global Secondary Index (GSI) can help. GSIs allow you to query the table using a different partition key and sort key based on non-primary key attributes, improving performance while avoiding the cost and inefficiency of a scan.

Global Secondary Index (GSI)
Whenever the base table is updated, any affected attributes in secondary indexes also need to be updated, which incurs additional costs. Instead of projecting non-key attributes into indexes, it’s often more cost-effective to perform extra queries on the main table for those attributes. Querying the main table is generally cheaper and offers higher throughput compared to the expense of updating secondary indexes.
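Conceptually, a GSI behaves like a second copy of selected attributes, organized under a different key. The in-memory sketch below uses the supplier-by-email lookup as its example; the item shapes, the helper name `build_index`, and the sample data are assumptions for illustration and do not reflect how DynamoDB maintains indexes internally.

```python
from collections import defaultdict

# Hypothetical supplier items: the third one has no email attribute.
suppliers = [
    {"PK": "SUPPLIER#s1", "name": "Acme Parts", "email": "sales@acme.example"},
    {"PK": "SUPPLIER#s2", "name": "Globex", "email": "orders@globex.example"},
    {"PK": "SUPPLIER#s3", "name": "No Email Co"},
]


def build_index(items, index_key, projected):
    """Re-key items by `index_key`, copying the base key plus projected attributes."""
    index = defaultdict(list)
    for item in items:
        if index_key not in item:
            continue  # items without the index key do not appear in the index
        entry = {k: item[k] for k in ("PK", *projected) if k in item}
        index[item[index_key]].append(entry)
    return index


email_index = build_index(suppliers, "email", projected=["name"])
print(email_index["orders@globex.example"])
```

Every attribute added to `projected` is one more value to copy on each write, which is the trade-off described above. Items that lack the index key are simply absent from the index.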

Compare table
Learn more about Secondary Indexes in DynamoDB:
  • Local Secondary Index (LSI)
  • Global Secondary Index (GSI)
  • Sparse Index

Step 4.5: Choosing a Primary Key

When choosing a primary key (Partition Key and Sort Key) for DynamoDB, it’s crucial to follow certain guidelines to ensure efficient data storage and querying.

Cardinality refers to the number of unique values in a key:

  • High Cardinality: Keys like customer ID or order ID, which have many distinct values, are ideal because they distribute data evenly across partitions.
  • Low Cardinality: Keys like status (e.g., “active” or “pending”) have few unique values, leading to hotspots—overloaded partitions—which can cause throttling and degrade performance.

When the partition key has high cardinality, it ensures that data can be distributed across many servers effectively. Why?

  • Balanced Load: With high cardinality, you spread data evenly across different partitions (which correspond to different servers). Each distinct key value is hashed independently, so a large number of distinct values ends up spread roughly evenly across the available partitions.
  • Avoiding Hotspots: If you choose a low-cardinality partition key (like a Boolean true/false or a status with only a few values), it creates hotspots. A hotspot is when many records get stored in the same partition, leading to uneven data distribution. If one partition holds too much data or is accessed frequently, it can get overloaded, leading to throttling or performance degradation.
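The effect of cardinality on distribution can be simulated in a few lines. The hash function, the partition count of four, and the sample keys are assumptions chosen for this sketch, since DynamoDB's real partitioning is internal to the service.

```python
import hashlib
from collections import Counter


def partition_for(key, partition_count=4):
    """Illustrative stand-in for DynamoDB's internal hashing."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def partitions_used(keys, partition_count=4):
    """Count how many items land on each partition."""
    return Counter(partition_for(k, partition_count) for k in keys)


# High cardinality: 1000 distinct order IDs.
order_ids = [f"ORDER#{n}" for n in range(1000)]
# Low cardinality: 1000 items sharing only two status values.
statuses = ["active" if n % 2 else "pending" for n in range(1000)]

print("order IDs:", dict(partitions_used(order_ids)))
print("statuses: ", dict(partitions_used(statuses)))
```

With two distinct values, the 1000 items can occupy at most two partitions no matter how many partitions exist, while the order IDs are free to spread across all of them.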

The same principles apply to GSIs as to base tables. If a GSI's partition key has low cardinality (few unique values), many writes will be directed to a small number of partitions, creating hotspots. These hotspots can overwhelm a partition, which causes the GSI to throttle. A throttled GSI can't keep up with the updates from the base table. Because DynamoDB must propagate every base-table write to the affected GSIs, this backpressure can spread back to the base table. As a result, writes to the base table may also be throttled, slowing down the entire system.

The choice of sort key depends on your access patterns, that is, how you need to query data. The sort key should support ordering or range queries, such as sorting by date or status.
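As one possible illustration of a sort key that supports range queries, the sketch below builds parameters in the shape of a DynamoDB Query request for a supplier's orders within a date range. The table name, the key formats, and the idea of embedding the order date in the sort key are assumptions for illustration, not the final RetoolerStock design. It relies on ISO 8601 dates (YYYY-MM-DD) sorting chronologically when compared as strings.

```python
def orders_in_range_request(supplier_id, start_date, end_date):
    """Build parameters in the shape of a DynamoDB Query request.

    Assumes order items are stored under the supplier's partition key with
    sort keys such as "ORDER#2024-10-15" (a hypothetical format).
    """
    return {
        "TableName": "RetoolerStock",
        "KeyConditionExpression": "PK = :pk AND SK BETWEEN :start AND :end",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"SUPPLIER#{supplier_id}"},
            ":start": {"S": f"ORDER#{start_date}"},
            ":end": {"S": f"ORDER#{end_date}"},
        },
    }


request = orders_in_range_request("s1", "2024-10-01", "2024-10-31")
print(request["KeyConditionExpression"])
print(request["ExpressionAttributeValues"])
```

Because the range condition is part of the key condition rather than a filter, only the items inside the range are read.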

The real power of DynamoDB comes from combining the partition key and sort key to optimize for your access patterns. You need to consider how you’ll retrieve and access the data in your application. Here’s a step-by-step process:

  1. Define your access patterns: Understand how you will query the data—by customer, by date, by status, etc.
  2. Choose a partition key: Pick something that is queried frequently and has high cardinality.
  3. Choose a sort key: Pick a field that supports your ordering needs and range queries.
  4. Full Scan or GSIs? This depends on how often you need each access pattern. If an access pattern is only needed once or twice a day, it can be more cost-effective to perform a Full Scan than to maintain an additional GSI. GSIs add storage costs and overhead to write operations (since each write must also update the GSIs), particularly in write-heavy applications.

Step 5: Apply to RetoolerStock

The following videos demonstrate how to design the database for RetoolerStock and set up the AWS DynamoDB table accordingly.

Video 1: https://drive.google.com/file/d/1nXsjxLOTcPHYfggPwiNqD9kL_wiI17Ww/view?usp=drive_link

Video 2: https://drive.google.com/file/d/1iKPU6Rs_k6mBct7ZRx0n9tRXidCfzfvz/view?usp=drive_link

Learn More about DynamoDB

  1. DynamoDB Streams (later in the series)
  2. Access Control
  3. Amazon DynamoDB Accelerator (DAX)


Nhan Hoang
Retool Developer