So far, you could only create a dataset by uploading a CSV file. But let's face it, most businesses keep a huge amount of data in various databases, so why bother with CSV? Whatever the business asks for, SQL is the usual way to sort that data and extract the most important parts. Luckily, with Graphite, you can connect to your database and write your own SQL. Let's figure out how to do it.
As soon as you log in to Graphite, go to Datasets and click Create New. You can choose a connection to a MySQL/MariaDB or PostgreSQL database. Other connections, such as MS SQL and Amazon Redshift, are still being developed, but there is a little hack: if your only data source is Redshift, just create a PostgreSQL connection with your Redshift parameters and the connection should work.
After selecting a connection, give your dataset a name. Optionally, you can write a description and select or create a tag.
Now we come to the most important part: establishing a connection. Enter your server hostname or IP address, database port, database user, database password, and database name, then click the Check Connection button. For the connection to work, your firewall must accept incoming requests from the following two IP addresses: 126.96.36.199 and 188.8.131.52.
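If Check Connection fails, it can help to confirm that the database is listening on the host and port you entered. Below is a minimal Python sketch of such a check. The helper name `can_reach` and the hostname `db.example.com` are hypothetical and not part of Graphite. The ports listed are the usual defaults for each database, and your server may use different ones.

```python
import socket

# Usual default ports; your server may be configured differently.
DEFAULT_PORTS = {"mysql": 3306, "postgresql": 5432, "redshift": 5439}


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the hostname and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Example call with a hypothetical hostname:
# can_reach("db.example.com", DEFAULT_PORTS["postgresql"])
```

Keep in mind that this runs from your own machine, so it only tells you that the database port is open from where you are. It cannot tell you whether your firewall lets Graphite's IP addresses through.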
Once your connection is established, it's time to show us your SQL knowledge: write the desired SQL and click the Run SQL button to get your data.
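To give an idea of the kind of SQL you might paste in, here is a small sketch that tries out an aggregate query on an in-memory SQLite database. SQLite is only a local stand-in for your real MySQL or PostgreSQL server. The `orders` table, its columns, and the `run_demo` helper are made up for illustration.

```python
import sqlite3

# Hypothetical query: table and column names are illustrative only.
QUERY = """
SELECT customer_id,
       COUNT(*)    AS order_count,
       SUM(amount) AS total_amount
FROM orders
WHERE status = 'completed'
GROUP BY customer_id
ORDER BY customer_id
"""


def run_demo():
    """Build a tiny sample table in memory and run QUERY against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (customer_id INTEGER, amount REAL, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            (1, 20.0, "completed"),
            (1, 5.0, "completed"),
            (2, 12.5, "completed"),
            (2, 99.0, "cancelled"),  # filtered out by the WHERE clause
        ],
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


print(run_demo())  # [(1, 2, 25.0), (2, 1, 12.5)]
```

Naming the aggregated columns with `AS` in the query means the result arrives with readable column names, so there is less to rename afterwards.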
Scroll down to see all the columns from your dataset. If necessary, you can change column names, data types, or data formats, then click the Create button to create your dataset. Getting data from databases with SQL is much easier, because you shape the dataset to your needs! By repeating the steps above, you can easily get your data and start running various models without writing a single line of code. 🙂