The importance of data in dynamic pricing

With the ongoing boom in online shopping, implementing dynamic pricing has never been more important than it is right now. Prices need to make sense within an increasingly competitive landscape, and your business’ pricing model needs to be ready to adapt to fluctuations in customer demand and purchasing behaviors. The ability to take quick, informed action around pricing has a massive impact on overall profit margins.

Traditionally, pricing in retail was set based on static price rules that utilized a limited number of data inputs (e.g. cost base, conversion rates, etc.). With this approach, massive amounts of important data – both transaction data and non-purchase data – went underutilized. This is data which could inform smarter, more agile pricing decisions!

In today’s hyperfast, highly competitive retail landscape, data-based dynamic pricing strategies are harnessing the power of this consumer data and using it to drive pricing decisions. The explosive growth of big data and the potential it contains for developing AI and machine learning approaches to pricing strategies has unlocked new opportunities for intelligent pricing solutions. Machine learning technology takes dynamic pricing to the next level, as it can process much larger data sets and can consider various influencing factors to predict the effect of price changes.

For companies retailing online, consumer behavior and the data generated by it should be a major focus. By considering consumer behavior when approaching pricing, companies continue to price at the value a customer ascribes to a given product, but they also work to measure, influence, and increase that perception of value. Inputs to this new approach to pricing come both from the data generated by buyers’ behavior and from the larger competitive landscape.

Today, thanks to artificial intelligence and machine learning, retailers can more readily get a robust view of what both competitors and customers are doing at any given moment, as well as a better sense of the influences and reasons behind their buying behavior. The wealth and sheer quantity of data online consumers generate is enabling new, better-informed strategies to drive customer happiness and company profitability.

Big data in retail

With the rapid growth in online, mobile, and social media-driven shopping, there has been an explosion of data, and “big data” is now widely available. And when it comes to pricing, price changes are happening instantly, several times a day for millions of products across the entire online ecosystem – amounting to tens of millions of pricing decisions around the world every single day. This represents tens if not hundreds of millions of associated data points!

It is likely that this growth in data will continue. In the retail sector in particular, customers leave a data trail every time they shop online. Plus, there are other data sources such as competitor prices, weather data, and internal company data, like the data gathered on the company’s marketing activities. Retailers typically see big data as falling into two primary categories:

Structured data: this type of data includes names, addresses, transaction history, loyalty programs, and almost any other data that involves an “amount” type of measurement.

Unstructured data: this type of data includes product reviews, images, social likes and mentions, and any other social media data.

Advanced dynamic pricing tools make it possible to compile this enormous amount of structured and unstructured data and use it to implement a comprehensive strategy. And big data is the resource that allows a dynamic pricing strategy to work. This technology is centered on the optimization of data processing (and all that doing so entails). It’s not just about collecting a large amount of data, but using the collected data to optimize your pricing. More and more retailers are moving toward using this wealth of data to make better pricing decisions and to keep a competitive edge. The massive amount of data available is therefore also a great driver of dynamic pricing.

Any retailer or brand utilizing dynamic pricing is dependent on the data its decisions are based on. In that regard, the quality of the data is very important, as it feeds into an algorithm which will be used to develop and inform demand forecasting and automated pricing decisions.

Data and dynamic pricing algorithms

By analyzing their massive quantities of available data, in combination with current market events and other external data sources, retailers can optimize their prices for the customers with the help of algorithms. We can differentiate between two types of algorithms that are being used in pricing:

  • Traditional, rule based algorithms: the logic of these algorithms has been explicitly programmed. They often consist of a range of “if/then” rules that determine prices based on a range of influencing factors.
  • Machine learning based algorithms: these algorithms learn from a set of training data to predict the effect of price on sales, revenue and profit. Based on that forecast, one can run optimizations to reach business targets. The algorithm’s prediction logic is not explicitly programmed, and it continuously learns from new data.

A limitation of traditional algorithms is that they can only consider a limited number of influencing factors, often fewer than three. Managing and monitoring this rule-based approach also takes a fair amount of time and effort. In contrast, by utilizing big data in conjunction with a machine learning based approach, retailers are now better equipped to define the most appropriate pricing strategy for their business.
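To make the rule-based type concrete, here is a minimal Python sketch of explicitly programmed “if/then” pricing that considers only two influencing factors. The function name, thresholds, and percentages are illustrative assumptions and are not taken from any real pricing system.

```python
def rule_based_price(base_price: float, stock_level: int, competitor_price: float) -> float:
    """Apply explicit if/then rules using two influencing factors."""
    price = base_price
    # Rule 1 (hypothetical): undercut the competitor by 1% if they are cheaper.
    if competitor_price < price:
        price = competitor_price * 0.99
    # Rule 2 (hypothetical): add a 5% markup when fewer than 10 units are in stock.
    if stock_level < 10:
        price *= 1.05
    # Guard rail (hypothetical): never go below 80% of the base price.
    return round(max(price, base_price * 0.80), 2)
```

Every additional factor means another hand-written rule and more interactions between rules to monitor, which is the maintenance burden described above.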

With this advanced technology, it’s possible to run processes on quantities of data that an individual pricing manager would not be able to manage. A machine learning based approach can calculate demand for individual items in a business’ assortment – which, when dealing with thousands of SKUs, is a monumental task – and it can automatically consider a range of influencing factors, both internal (e.g. stock levels, target sales dates) and external (e.g. competitor data, time-based and weather factors). Learn more about applications of machine learning in retail here.
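The “forecast, then optimize” idea can be sketched in a few lines of Python. This example is deliberately simplified: it uses an ordinary least squares fit of a log-linear demand curve as a stand-in for a real machine learning model, considers price as the only influencing factor, and picks the most profitable price from a list of candidates. All function names and the demand model are assumptions made for illustration.

```python
import math

def fit_log_linear_demand(prices, units_sold):
    """Fit log(units) = a + b * log(price) by ordinary least squares.

    Assumes all prices and unit counts are positive and prices are not all equal.
    """
    xs = [math.log(p) for p in prices]
    ys = [math.log(u) for u in units_sold]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # price elasticity of demand
    a = mean_y - b * mean_x  # intercept
    return a, b

def predict_units(a, b, price):
    """Forecast unit sales at a given price from the fitted curve."""
    return math.exp(a + b * math.log(price))

def best_price(a, b, unit_cost, candidate_prices):
    """Return the candidate price with the highest forecast profit."""
    return max(candidate_prices, key=lambda p: (p - unit_cost) * predict_units(a, b, p))
```

A production model would replace the single-variable fit with one that also takes the internal and external factors listed above as inputs, but the two steps (predict the effect of a price, then search for the price that best meets the target) stay the same.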

The types of data used for dynamic pricing optimization

A key reason retailers, particularly online retailers, are turning to AI and machine-learning based dynamic pricing is the massive amount of data and external factors that need to be considered for price determination: sales data, inventory levels, competitor data, promo data, transaction data, seasonal trends, weather data – and more! All of this data is huge and somewhat unwieldy on its own, but in the age of AI it’s increasingly possible to process this quantity of data and extract actionable insights.

In addition to ‘structured’ and ‘unstructured’ types of data mentioned above, it’s important to utilize both micro and macro level data when developing machine learning-based price optimization. This includes:

Micro level data: internal data including sales data & transaction data, product master data, cost data, historic prices, marketing data and business strategies.
Macro level data: this is external market data and influential factors that include competitor data, time-oriented data (e.g. day of the week; the season), and location specific data (e.g. regional data; weather data).

Internal Data

Product Attributes: This includes essential information such as cost, margin ceiling, base price, and MAP price. Product attributes (or product master data) are the digital representation of a retailer’s assortment and an important tool for dynamic price optimization. These data points often include product ID, master-variant assignment, current price, RRP, lower and upper price limit, seasonal identification, brand, color, size, stock level, expiration date or target sales date and much more. Grouping these attributes across categories is an important way of utilizing this type of data. It’s often difficult for the models to learn from data on single products alone, so it’s important to be able to utilize and learn from the data on a category level.

Inventory Levels: This is essential data regarding details about current inventory levels and overall supply. The existing supply is combined (via the inventory tracking) with the existing demand, which is a key determinant in how a dynamic pricing software calculates optimal prices in line with the market.

Transactional Data: This includes all transactions, units sold, price history, and conversions. It also includes buyer information, and manufacturing or sourcing costs. Any machine learning based dynamic pricing software will need a company’s sales and transaction data to calculate the demand for each product in its range. This forms the basis for every price decision of the AI. At a minimum, all sales information is needed, e.g. which items were sold at what price.

Additional transaction information further improves the AI’s forecasting quality and results. This can include items the customer has viewed, items they put in their shopping baskets, items they deleted or cancelled from their baskets, items they’ve saved or put on a wish list, and items they’ve searched for, to name just a few examples.

External Data

Competitor Data 

Competitor data can be far reaching but may include elements such as list price, ship price, buy-box price, FBA, out of stocks, geography, and product reviews and ratings. This data can be gathered by crawling (also called ‘scraping’) software which gathers the information from publicly available sources. Businesses are becoming increasingly sophisticated with regards to trying to limit the ability of their competitors to gather this data.

Days of the Week

Days of the week have an influence on consumer demand. Depending on a company’s business model, they will likely see sales go up or down depending on what day it is. Perhaps they’re more of a weekday business, or a weekend business.

A dynamic pricing strategy can take this data and set prices to increase or decrease over the weekend, based on demand for those specific days. Price optimization solutions allow businesses to create custom timeframes for accurate implementation of one-time, ongoing, or limited-time price changes.
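As a rough illustration of how the day-of-week effect can be measured from historic sales, the Python sketch below computes a simple demand index per weekday (average units on that weekday divided by the overall daily average). The function name and the input format are assumptions for this example.

```python
from collections import defaultdict
from datetime import date

WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]

def weekday_demand_index(daily_sales):
    """daily_sales: iterable of (date, units_sold) pairs, one pair per day.

    Returns {weekday name: average units on that weekday / overall daily average}.
    Assumes at least one day has non-zero sales.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for day, units in daily_sales:
        name = WEEKDAY_NAMES[day.weekday()]  # date.weekday(): Monday is 0
        totals[name] += units
        counts[name] += 1
    overall_average = sum(totals.values()) / sum(counts.values())
    return {name: (totals[name] / counts[name]) / overall_average for name in totals}
```

An index above 1 marks a day with above-average demand, and an index below 1 a quieter day. In a machine learning setup, the weekday would typically be passed to the model as a feature rather than applied as a fixed multiplier.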


Holidays

Upcoming holidays will increase demand for certain items, for example wrapping paper in advance of Christmas, or flowers for Mother’s Day. By utilizing historic transaction data plotted against holiday seasons, retailers can pinpoint which items in their assortment show an increased demand and when. This data helps a dynamic pricing algorithm in forecasting demand and setting an optimal price for those respective items.

Regional Trends

E-commerce based retail has the advantage of being able to reach a much wider audience via online based platforms and marketing. However, localized factors and conditions may influence demand from different regional or geographic segments depending on what is occurring in their area. For instance, one region may be celebrating a festival or event which is driving up demand for certain products. Being able to use data to measure these regional variances can help in creating pricing strategies and distinctions down to the regional level, if desired.

Weather and Seasonal Data 

Weather can affect sales, both overall and for certain products. For example, good weather tends to be bad for online retail, while bad weather keeps people at home, where they shop online. From a product perspective, when the temperature increases, consumers will start searching for standing fans and are more likely to buy them (i.e. a higher probability of conversion). Another case in which weather data could be valuable is as winter approaches. When the temperature decreases, search volume for skis will increase. Temperature data and weather forecast data can therefore help in predicting demand and optimizing prices accordingly.

How does data influence the dynamic pricing algorithm?

Quantity of data

Not all data points are accessible or even applicable for every business. If a retailer is new to the market, for example, it may not have access to customer testimonials. In addition, companies are rightly very cautious about using personal consumer data. However, since the machine learning model is always adapted to the individual case, it can be designed to deliver good results even without these data points. For example, personal data is not even necessary for price optimizations at the product level.

As mentioned above, however, machine learning approaches to dynamic pricing are particularly suited to learning from large data sets. The greater the quantity of data available, the better the machine learning model can be trained to make accurate forecasts. A machine learning algorithm learns from past price changes and the effect the changes have on sales. Therefore it’s good to have data from 2-3 price changes per product with a significant amount of associated sales. Smarter algorithms, like the ones we build at 7Learnings, do not necessarily need data from each product – they can learn across the volume of data from higher selling products and optimize prices for long tail, low selling products as well. Learn more about long tail pricing with machine learning here.
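A quick way to see which products already have enough price history is to count price changes and units sold per product. The Python sketch below does this for a list of daily records. The record layout, the function names, and the `min_units` threshold are assumptions for illustration; only the “at least two price changes” default mirrors the rule of thumb above.

```python
from collections import defaultdict

def count_price_changes(records):
    """records: iterable of (product_id, day, price, units_sold).

    Returns {product_id: (number of price changes, total units sold)}.
    """
    by_product = defaultdict(list)
    for product_id, day, price, units in records:
        by_product[product_id].append((day, price, units))
    summary = {}
    for product_id, rows in by_product.items():
        rows.sort(key=lambda r: r[0])  # order chronologically by day
        # A change is any day whose price differs from the previous observation.
        changes = sum(1 for prev, cur in zip(rows, rows[1:]) if cur[1] != prev[1])
        total_units = sum(r[2] for r in rows)
        summary[product_id] = (changes, total_units)
    return summary

def products_with_enough_history(records, min_changes=2, min_units=100):
    """Products with at least min_changes price changes and min_units units sold."""
    return {
        product_id
        for product_id, (changes, units) in count_price_changes(records).items()
        if changes >= min_changes and units >= min_units
    }
```

Products that fall outside this set are the long tail candidates that would have to borrow information from better-selling items.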

Quality of data

The quality of data collected will have a direct impact on the machine learning model built for dynamic pricing. The higher the quality of the data, the easier it is to utilize it and to build out features to guide the system. Data is of good quality when:

  • The data is complete. In a best case scenario, this would include all requested variables with no or minimal missing observations, and with data tables that match so it is easy to join data points, for instance when matching up product attributes with transaction data.
  • The data is clean. This means that errors, such as missing sales tracking, incorrect product data, or poorly inputted numerical values, don’t negatively impact the overall data set.
  • The data is consistent. Different data points agree with one another – e.g. profit can clearly be calculated from revenue and costs.
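The three criteria above can be turned into simple automated checks. The following Python sketch flags a single sales record as incomplete, unclean, or inconsistent. The field names, the specific checks, and the tolerance are assumptions chosen for this example, and a real pipeline would have many more rules.

```python
REQUIRED_FIELDS = ("product_id", "units", "revenue", "cost", "profit")

def check_row(row, tolerance=0.01):
    """Return a list of quality issues found in one sales record (a dict)."""
    # Complete: every required field is present and filled in.
    if any(row.get(field) is None for field in REQUIRED_FIELDS):
        return ["incomplete"]  # the remaining checks need all fields
    issues = []
    # Clean: no obviously wrong numerical values (hypothetical check).
    if row["units"] < 0 or row["revenue"] < 0:
        issues.append("unclean")
    # Consistent: profit can be recalculated from revenue and cost.
    if abs(row["profit"] - (row["revenue"] - row["cost"])) > tolerance:
        issues.append("inconsistent")
    return issues
```

Running a check like this on every incoming batch makes it possible to spot problems before they reach model training.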

Poor quality data can cause a number of problems, including:

  • Data which isn’t clean can impair the performance of machine learning models, as they are designed to learn from the data they are fed.
  • If the data isn’t consistent, it becomes difficult to calculate the desired KPIs, particularly relating to revenue and profit.
  • Incomplete data can make it difficult to build a machine learning model at all. The more data available, the better a machine learning approach to dynamic pricing can be trained.

Managing large and complex data sets in dynamic pricing

Data collection

When it comes to building a tailored, machine learning based dynamic pricing solution, the targeted internal data is gathered and transmitted by the clients on a daily basis. They prepare the data on their end, extracting it from their data warehouse and making sure all of the proper variables are contained. There are also automated programs that help to manage this data upload and move to cloud storage.

When it comes to external data, this is gathered on the basis of business strategy and needs. For instance, if a company sells sunglasses, it is of particular importance for them to track weather forecast data on a regular basis. For these types of data, open APIs are utilized to gather this information on a daily basis.

Competitor data is gathered in different ways, usually by a web crawling service or software that gathers the requested data points from open sources. This is increasingly difficult to do as companies are becoming more protective of their data.

Data processing

Managing large and complex datasets can be challenging. At 7Learnings, we use a cloud based data warehouse which provides a scalable solution to manage millions of observations in a very fast and efficient manner. After data is collected, it must be cleaned of errors and prepared for further processing. This step is challenging because data of different formats from different sources must be merged. The task is therefore performed by experienced data scientists who ensure that the data is correctly and completely prepared for the algorithm.

Once the data has been gathered and gone through an initial cleaning process, it is stored in single tables which are then combined step by step to further prepare the data to feed into the algorithm. Each type of company, with its unique KPIs and market scenarios, will have its own data model, which will need different data preparations. These preparations are usually managed via the cloud-based data warehouse as well.
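In a data warehouse this combining step is typically expressed as table joins. As a small, plain-Python stand-in, the sketch below joins transaction rows with product attributes on a shared product ID and reports transactions that have no matching product. The function name and field names are assumptions for illustration.

```python
def join_transactions_with_products(transactions, products):
    """Attach product attributes to each transaction via 'product_id'.

    Returns (joined_rows, unmatched_transactions).
    """
    products_by_id = {p["product_id"]: p for p in products}
    joined, unmatched = [], []
    for transaction in transactions:
        product = products_by_id.get(transaction["product_id"])
        if product is None:
            # No product master data for this sale: a completeness problem.
            unmatched.append(transaction)
        else:
            # Transaction fields win if both tables share a column name.
            joined.append({**product, **transaction})
    return joined, unmatched
```

Keeping track of the unmatched rows is useful because they point to exactly the kind of table mismatch that makes data incomplete.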

Data inputting

From the prepared data, data scientists will then create features – input variables for the machine learning models – that are tailored to specific companies and take account of their unique strategic needs. Depending on the particular needs of a company, different features may be more or less suitable for explaining an outcome. The type of machine learning model and the parameters used can also differ considerably from case to case.

For example, companies may occasionally run out of stock of important products. For these particular products, it’s helpful to let the machine learning model know whether or not they are available (or have been in the past).
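A minimal sketch of such an availability feature in Python might look like the following. It assumes one row per day for a single product, in chronological order, with a `stock_level` field; the feature names are hypothetical.

```python
def add_availability_features(daily_rows):
    """daily_rows: chronologically ordered dicts for ONE product,
    each containing a 'stock_level' key.

    Returns new rows with two extra fields:
      in_stock              - True if the product could be bought that day
      in_stock_previous_day - availability one day earlier (None on the first day)
    """
    featured = []
    previous_in_stock = None  # unknown before the first observed day
    for row in daily_rows:
        in_stock = row["stock_level"] > 0
        featured.append({
            **row,
            "in_stock": in_stock,
            "in_stock_previous_day": previous_in_stock,
        })
        previous_in_stock = in_stock
    return featured
```

With a flag like this, the model can tell a day with zero sales caused by an empty shelf apart from a day with zero sales caused by a price that was too high.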

As another example, sometimes a set of products may behave similarly within the model as their price changes, but in the backend will not be grouped together by any existing information, such as product category. However, the machine learning model can tell that they are often bought together, or purchased by the same kind of customers. This could be, for example, the ingredients needed for a trending recipe. In this instance, it is possible to create a feature that indicates that these products might be considered together.
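One simple way to surface such groupings is to count how often pairs of products appear in the same basket. The Python sketch below is an illustrative assumption about how a “bought together” feature could be derived, and the function names and the `min_count` threshold are made up for this example.

```python
from collections import Counter
from itertools import combinations

def co_purchase_counts(baskets):
    """baskets: iterable of lists of product IDs bought in one order.

    Returns a Counter mapping (product_a, product_b) pairs to the number
    of baskets containing both. Pairs are stored in sorted order.
    """
    counts = Counter()
    for basket in baskets:
        # set() ignores repeated items; sorted() gives a stable pair order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

def frequently_bought_together(baskets, min_count=2):
    """Pairs that co-occur in at least min_count baskets."""
    return {pair for pair, n in co_purchase_counts(baskets).items() if n >= min_count}
```

The resulting pairs can be turned into a shared group identifier that is then passed to the model as an additional feature.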

Making your data work for you

When it comes to dynamic pricing, the key is to fully analyze and understand the wealth of data at a company’s disposal. Good quality data, and particularly good analytics, can help companies identify factors that are often overlooked – such as the broader macroeconomic situation, consumer purchasing behaviors, and external influences – and reveal what drives optimal prices for each customer segment and product.

With huge assortments and all the associated data points that go with them, it is now too expensive and time-consuming for companies to analyze thousands of products manually. Machine learning based dynamic price optimization systems can identify narrow segments, determine what drives value for each one, and match that with historical transactional data. This allows companies to set optimal prices for clusters of products and segments based on data. Automation also makes it much easier to replicate and tweak analyses so it’s not necessary to start from scratch every time, as the pricing algorithm learns and adapts with data over time.