Machine Learning Model Functions
supports the following SQL functions for machine learning models.
The scope of a machine learning model is the schema.
Supported machine learning models are divided into these categories.
You can create a machine learning model, rename the model, export the syntax for the model creation, retrain the model, execute a query against the model, and drop the model.
Train a new machine learning model of type <model type> on the result set returned by the SQL SELECT statement. After the database creates the model, <model name> becomes a callable function in SQL SELECT statements.
Syntax
Parameter | Data Type | Description |
---|---|---|
model name | VARCHAR | The name of the model to create. |
Parameter | Data Type | Description |
---|---|---|
model type | VARCHAR | The type of machine learning model to create. |
These models are supported. You can find full descriptions of each model in Regression Models, Classification Models, Clustering and Dimension Reduction Models, or Other Models.
- SIMPLE LINEAR REGRESSION
- MULTIPLE LINEAR REGRESSION
- POLYNOMIAL REGRESSION
- LINEAR COMBINATION REGRESSION
- VECTOR AUTOREGRESSION
- KMEANS
- KNN (K Nearest Neighbors)
- LOGISTIC REGRESSION
- NAIVE BAYES
- NONLINEAR REGRESSION
- FEEDFORWARD NETWORK
- PRINCIPAL COMPONENT ANALYSIS
- LINEAR DISCRIMINANT ANALYSIS
- SUPPORT VECTOR MACHINE
- DECISION TREE
- GAUSSIAN MIXTURE MODEL
- ASSOCIATION RULES
Parameter | Data Type | Description |
---|---|---|
option_list | VARCHAR | The options for the specified machine learning model that is specified as a comma-separated list in the format: <option name 1> -> <value 1>, <option name 2> -> <value 2> , and so on. Names and values must be all enclosed in single quotes and are case sensitive with the exception that Boolean values can be true, false, TRUE, or FALSE. Refer to the respective model for the full options list. Example options list: options( 'yIntercept' -> '10', 'metrics' -> 'true' ) |
The SQL SELECT statement that serves as the basis for the model must return rows that fit the specified requirements of the model. For example, in multiple linear regression, the first N columns are the independent variables and the last column is the dependent variable.
You cannot create a machine learning model with an existing schema and name combination.
Example
Assume that you created the mldata table that contains the data for the model. Then, you can create the my_model machine learning model based on that data.
Create a machine learning model when the model with the specified name does not exist. Otherwise, replace the existing model. You can use the existing model with the existing data and the existing options until the replacement process completes.
Syntax
Refer to CREATE MLMODEL for the parameters.
Example
Assume that you created the mldata table that contains the data for the model. Then, you can create or replace the my_model machine learning model based on that data.
Rename a machine learning model.
Syntax
Parameter | Data Type | Description |
---|---|---|
model name | VARCHAR | The name of the model to rename. |
new model name | VARCHAR | The new name of the model. |
Example
Drop a machine learning model.
Syntax
Parameter | Data Type | Description |
---|---|---|
model name | VARCHAR | The name of the model to drop. You can drop multiple models by specifying additional model names and separating each with commas. |
Example
To drop a machine learning model, my_model:
To drop multiple machine learning models:
Return the command that can recreate the machine learning model.
Syntax
Parameter | Data Type | Description |
---|---|---|
model name | VARCHAR | The name of the model to create. |
Example
Output
The output command includes the schema explicitly.
Retrain a machine learning model without changing any model options.
Syntax
Parameter | Data Type | Description |
---|---|---|
model name | VARCHAR | The name of the model to retrain. |
Example
Execute a query against a machine learning model.
Syntax
Example
Create a table with data for the model.
Create a multiple linear regression model based on the data.
Execute a SELECT query against the multiple linear regression to see the actual and predicted values. Limit the result set to 10 rows.