
Tips and tricks for your datasets

Before you run a model, get familiar with its essential parameters. That way you will know what the dataset should look like, i.e. which columns it must have. You should also pay attention to the format of the variables.

Graphite Note - example of monthly data (the first day of each month)

To run the Timeseries Forecast Model, you have to select a time-related column and a target column. The target column must be numeric and represents the measurement you want to predict. You can use daily, weekly, or monthly data, but be careful about the format of weekly and monthly data. Each month must be represented by the first or last date of the month. For weekly data, choose which day of the week represents the week: if Wednesday is selected, the time-related column must contain the date of the Wednesday of each week.
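Before uploading, it can help to check the time-related column against these rules. The following is a minimal sketch in plain Python, not part of Graphite Note itself. The function names are hypothetical, and it assumes that a monthly dataset sticks to one convention throughout (all first-of-month or all last-of-month dates).

```python
import calendar
from datetime import date


def is_month_start(d: date) -> bool:
    """True if d is the first day of its month."""
    return d.day == 1


def is_month_end(d: date) -> bool:
    """True if d is the last day of its month (handles leap years)."""
    return d.day == calendar.monthrange(d.year, d.month)[1]


def monthly_dates_ok(dates) -> bool:
    """Assumption: every date is a month start, or every date is a month end."""
    dates = list(dates)
    if not dates:
        return False
    return all(is_month_start(d) for d in dates) or all(is_month_end(d) for d in dates)


def weekly_dates_ok(dates, weekday: int) -> bool:
    """True if every date falls on the chosen weekday.

    weekday follows date.weekday(): Monday is 0, Wednesday is 2, Sunday is 6.
    """
    dates = list(dates)
    return bool(dates) and all(d.weekday() == weekday for d in dates)


# Hypothetical sample columns
monthly = [date(2023, 1, 1), date(2023, 2, 1), date(2023, 3, 1)]
weekly = [date(2023, 1, 4), date(2023, 1, 11), date(2023, 1, 18)]  # Wednesdays

print(monthly_dates_ok(monthly))
print(weekly_dates_ok(weekly, weekday=2))
```

If a check fails, the usual fix is to aggregate the measurements per month or week first and then label each period with its representative date.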

The remaining models all relate to customer segmentation, so their parameters are similar. Unlike the Timeseries Forecast Model, they also have optional parameters.

The RFM Customer Segmentation Model requires three columns: a time-related column, a customer ID column, and a monetary column. Suppose your dataset contains sales data for a certain period, i.e. a list of all customers' transactions. You can select the transaction date as the time-related column and the sales amount as the monetary column (the monetary column must be numeric). Each customer is identified by their ID, so the customer name is an optional parameter.

The Customer Cohort Analysis Model has the same required parameters as the RFM model, plus a transaction/order ID column. The New vs Returning Customers Model is slightly different: the time-related, customer ID, and order ID columns are required, while the monetary column (such as sales amount) is optional.
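The column requirements of the three customer models can be summarized as a small lookup table. The sketch below is illustrative only: the role names (`time`, `customer_id`, `monetary`, `order_id`), the model keys, and the `missing_roles` helper are hypothetical labels chosen for this example, not Graphite Note identifiers.

```python
# Required column roles per model, as described in the text above.
REQUIRED_ROLES = {
    "rfm": {"time", "customer_id", "monetary"},
    "cohort": {"time", "customer_id", "monetary", "order_id"},
    "new_vs_returning": {"time", "customer_id", "order_id"},
}

# Optional roles explicitly mentioned in the text.
OPTIONAL_ROLES = {
    "rfm": {"customer_name"},
    "new_vs_returning": {"monetary"},
}


def missing_roles(model: str, mapping: dict) -> list:
    """Return the required roles that have no dataset column assigned.

    mapping maps a role name to a column name in your dataset.
    """
    assigned = {role for role, column in mapping.items() if column}
    return sorted(REQUIRED_ROLES[model] - assigned)


# Hypothetical sales dataset without an order ID column
columns = {
    "time": "transaction_date",
    "customer_id": "customer_id",
    "monetary": "sales_amount",
}

print(missing_roles("rfm", columns))               # []
print(missing_roles("cohort", columns))            # ['order_id']
print(missing_roles("new_vs_returning", columns))  # ['order_id']
```

In this example the dataset is ready for the RFM model, but needs an order ID column before the other two models can run.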

Of course, your datasets may contain additional variables. New models are coming soon, and the existing ones will be upgraded. Stay tuned! 🙂
