sparktk naive_bayes
Functions
def load(
path, tc=<class 'sparktk.arguments.implicit'>)
Loads a NaiveBayesModel from the given path.
def train(
frame, label_column, observation_columns, lambda_parameter=1.0)
Creates a Naive Bayes model by training on the given frame.
Parameters:
frame | (Frame): | Frame of training data |
label_column | (str): | Column containing the label for each observation |
observation_columns | (List[str]): | Column(s) containing the observations |
lambda_parameter | (float): | Additive smoothing parameter. Default is 1.0. |
Returns | (NaiveBayesModel): | Trained Naive Bayes model |
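To illustrate what an additive smoothing parameter does in general, here is a minimal sketch in plain Python. It is not sparktk's implementation: `smoothed_likelihoods` and the sample counts are hypothetical and used only for illustration. The sketch assumes the common multinomial form, in which `lambda_parameter` is added to every per-feature count before normalizing, so that a feature never seen with a class still gets a non-zero probability.

```python
def smoothed_likelihoods(feature_counts, lambda_parameter=1.0):
    """Return additively smoothed feature probabilities for one class.

    Hypothetical helper for illustration only (not part of sparktk).
    feature_counts: non-negative per-feature totals observed for the class.
    """
    if lambda_parameter < 0:
        raise ValueError("lambda_parameter must be non-negative")
    k = len(feature_counts)
    # float() keeps the division exact under Python 2 as well as Python 3
    denominator = float(sum(feature_counts)) + lambda_parameter * k
    if denominator == 0:
        raise ValueError("no observations and no smoothing: probabilities are undefined")
    return [(count + lambda_parameter) / denominator for count in feature_counts]


counts = [3, 0, 1]                                 # made-up counts for one class
unsmoothed = smoothed_likelihoods(counts, 0.0)     # second feature gets probability 0
smoothed = smoothed_likelihoods(counts, 1.0)       # every feature gets a non-zero share
print(unsmoothed)
print(smoothed)
```

With `lambda_parameter=0.0` the unseen feature has probability zero, which would zero out the whole class likelihood for any observation containing it. Larger values pull the estimates toward a uniform distribution.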
Classes
class NaiveBayesModel
A trained Naive Bayes model
Example:
>>> frame = tc.frame.create([[1,19.8446136104,2.2985856384],
... [1,16.8973559126,2.6933495054],
... [1,5.5548729596, 2.7777687995],
... [0,46.1810010826,3.1611961917],
... [0,44.3117586448,3.3458963222],
... [0,34.6334526911,3.6429838715]],
... [('Class', int), ('Dim_1', float), ('Dim_2', float)])
>>> model = tc.models.classification.naive_bayes.train(frame, 'Class', ['Dim_1', 'Dim_2'], 0.9)
>>> model.label_column
u'Class'
>>> model.observation_columns
[u'Dim_1', u'Dim_2']
>>> model.lambda_parameter
0.9
>>> predicted_frame = model.predict(frame, ['Dim_1', 'Dim_2'])
>>> predicted_frame.inspect()
[#]  Class  Dim_1          Dim_2         predicted_class
========================================================
[0]      1  19.8446136104  2.2985856384              0.0
[1]      1  16.8973559126  2.6933495054              1.0
[2]      1   5.5548729596  2.7777687995              1.0
[3]      0  46.1810010826  3.1611961917              0.0
[4]      0  44.3117586448  3.3458963222              0.0
[5]      0  34.6334526911  3.6429838715              0.0
>>> model.save("sandbox/naivebayes")
>>> restored = tc.load("sandbox/naivebayes")
>>> restored.label_column == model.label_column
True
>>> restored.lambda_parameter == model.lambda_parameter
True
>>> set(restored.observation_columns) == set(model.observation_columns)
True
>>> metrics = model.test(frame)
>>> metrics.precision
1.0
>>> predicted_frame2 = restored.predict(frame, ['Dim_1', 'Dim_2'])
>>> predicted_frame2.inspect()
[#]  Class  Dim_1          Dim_2         predicted_class
========================================================
[0]      1  19.8446136104  2.2985856384              0.0
[1]      1  16.8973559126  2.6933495054              1.0
[2]      1   5.5548729596  2.7777687995              1.0
[3]      0  46.1810010826  3.1611961917              0.0
[4]      0  44.3117586448  3.3458963222              0.0
[5]      0  34.6334526911  3.6429838715              0.0
>>> canonical_path = model.export_to_mar("sandbox/naivebayes.mar")
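The precision of 1.0 reported in the example can be checked by hand against the `predicted_class` column, even though row 0 is misclassified. The sketch below is plain Python, not a sparktk call. `precision_recall` is a hypothetical helper, and treating label 1 as the positive class is an assumption that the example itself does not state.

```python
def precision_recall(actual, predicted, pos_label=1):
    """Compute binary precision and recall from two parallel label lists.

    Hypothetical helper for illustration only (not part of sparktk).
    """
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if p == pos_label and a == pos_label)
    false_pos = sum(1 for a, p in pairs if p == pos_label and a != pos_label)
    false_neg = sum(1 for a, p in pairs if p != pos_label and a == pos_label)
    # Guard against empty denominators; float() avoids Python 2 integer division
    precision = true_pos / float(true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / float(true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Class and predicted_class columns copied from the inspect() output above
actual = [1, 1, 1, 0, 0, 0]
predicted = [0.0, 1.0, 1.0, 0.0, 0.0, 0.0]

precision, recall = precision_recall(actual, predicted)
print(precision)
print(recall)
```

Under that assumption, every row predicted as 1 is truly 1, so precision is 2/2. The misclassified row 0 is a false negative, which lowers recall (2/3) but leaves precision unchanged.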
Ancestors (in MRO)
- NaiveBayesModel
- sparktk.propobj.PropertiesObject
- __builtin__.object
Instance variables
var label_column
var lambda_parameter
var observation_columns
Methods
def __init__(
self, tc, scala_model)
def export_to_mar(
self, path)
Exports the trained model as a model archive (.mar) to the specified path
Parameters:
path | (str): | Path to save the trained model |
Returns | (str): | Full path to the saved .mar file |
def predict(
self, frame, columns=None)
Predicts the labels for the observation columns in the given input frame. Creates a new frame with the existing columns and a new predicted column.
Parameters:
frame | (Frame): | Frame used for predicting the values |
columns | (List[str]): | Names of the observation columns |
Returns | (Frame): | A new frame containing the original frame's columns and a prediction column |
def save(
self, path)
def test(
self, frame, columns=None)
def to_dict(
self)
def to_json(
self)