===============
Surprise scikit
===============

.. figure:: img/surprise.*
   :align: right
   :alt: Surprise Logo
   :width: 250px

.. contents::
   :local:

Description
===========

Surprise is a Python scikit for recommender systems based on explicit rating data. It therefore does not support implicit ratings or content-based information. It is an easy-to-use scikit for building, testing and comparing different recommender-system algorithms. Complete documentation is available; see the Documentation section.

The name Surprise stands for *Simple Python RecommendatIon System Engine*.

Installation
============

With pip:

.. code-block:: bash

   $ pip install numpy
   $ pip install scikit-surprise

With conda:

.. code-block:: bash

   $ conda install -c conda-forge scikit-surprise

Getting started
===============

Here is a simple example showing how to load a dataset (downloading it if needed), split it for 5-fold cross-validation and compute the MAE and RMSE of the SVD algorithm.

.. code-block:: python

   from surprise import SVD
   from surprise import Dataset
   from surprise.model_selection import cross_validate

   # Load the movielens-100k dataset (download it if needed).
   data = Dataset.load_builtin('ml-100k')

   # Use the famous SVD algorithm.
   algo = SVD()

   # Run 5-fold cross-validation and print results.
   cross_validate(algo, data, measures=['RMSE', 'MAE'], cv=5, verbose=True)

Output:

.. code-block:: text

   Evaluating RMSE, MAE of algorithm SVD on 5 split(s).

               Fold 1  Fold 2  Fold 3  Fold 4  Fold 5  Mean    Std
   RMSE        0.9311  0.9370  0.9320  0.9317  0.9391  0.9342  0.0032
   MAE         0.7350  0.7375  0.7341  0.7342  0.7375  0.7357  0.0015
   Fit time    6.53    7.11    7.23    7.15    3.99    6.40    1.23
   Test time   0.26    0.26    0.25    0.15    0.13    0.21    0.06

How to use
==========

The first thing to do is to select an algorithm and import it. It will be used later to predict ratings. A full list of the available algorithms can be found in the documentation.

Then we need to load the data, either from a file or from a pandas DataFrame with the ``load_from_df()`` method.
The pandas DataFrame must have three columns, in this order: users, items and ratings.

To load the data from a pandas DataFrame:

.. code-block:: python

   from surprise import Dataset
   from surprise import Reader
   from surprise import KNNWithMeans

   # The rating scale must match the range of values in the "Rating" column.
   reader = Reader(rating_scale=(1, 5))
   data = Dataset.load_from_df(df_alloys_feedback[["User", "Item", "Rating"]], reader)

Rather than running cross-validation, we can also simply fit our algorithm to the whole dataset. The ``build_full_trainset()`` method builds a trainset object from all the ratings:

.. code-block:: python

   # Similarity parameters
   sim_options = {
       "name": "cosine",    # e.g. pearson, cosine, msd
       "user_based": True,  # True: user-based similarity; False: item-based
   }
   algo = KNNWithMeans(sim_options=sim_options)

   # Train on the full set of data
   training_set = data.build_full_trainset()
   algo.fit(training_set)

You can now predict ratings by calling the ``predict()`` method. Let's say you want the predicted rating of user 213 for item 132 (make sure they are both in the trainset!). Pass the ids exactly as they appear in the DataFrame.

.. code-block:: python

   user_id = 213
   item_id = 132

   # Get a prediction for a specific user and item.
   prediction = algo.predict(user_id, item_id)
   print(round(prediction.est, 2))

This prints the rating estimated by the algorithm, which always lies within the ``rating_scale`` given to the ``Reader``.

Output:

.. code-block:: text

   4.06

To get more information about the prediction, pass the actual rating as ``r_ui`` and set ``verbose=True``:

.. code-block:: python

   user_id = 213
   item_id = 132

   # Get a prediction for a specific user and item, and print its details.
   prediction = algo.predict(user_id, item_id, r_ui=4, verbose=True)

This should print:

.. code-block:: text

   user: 213        item: 132        r_ui = 4.00   est = 4.06   {'actual_k': 40, 'was_impossible': False}

Create your own prediction algorithm
====================================

If you are interested in creating your own prediction algorithm, the documentation provides a step-by-step guide on how to do it.
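
As a rough illustration of what such an algorithm can look like, the sketch below subclasses ``AlgoBase`` and overrides ``fit()`` and ``estimate()``. The class name ``GlobalMeanAlgorithm`` and the toy ``ratings`` DataFrame are made up for this example; treat it as a starting point under those assumptions rather than the recommended implementation.

.. code-block:: python

   import pandas as pd
   from surprise import AlgoBase, Dataset, Reader


   class GlobalMeanAlgorithm(AlgoBase):
       """Hypothetical baseline: predict the global mean rating for every pair."""

       def fit(self, trainset):
           # Let AlgoBase store the trainset, then keep the statistic we need.
           AlgoBase.fit(self, trainset)
           self.global_mean_ = trainset.global_mean
           return self

       def estimate(self, u, i):
           # u and i are inner ids; this baseline ignores them.
           return self.global_mean_


   # Toy data (assumption): three users, three items, ratings from 1 to 5.
   ratings = pd.DataFrame({
       "User": [1, 1, 2, 2, 3],
       "Item": [10, 20, 10, 30, 20],
       "Rating": [4, 5, 3, 2, 1],
   })
   reader = Reader(rating_scale=(1, 5))
   data = Dataset.load_from_df(ratings[["User", "Item", "Rating"]], reader)

   algo = GlobalMeanAlgorithm()
   algo.fit(data.build_full_trainset())

   # predict() converts the raw ids to inner ids and then calls estimate().
   prediction = algo.predict(1, 30)
   print(prediction.est)

Because ``estimate()`` ignores its arguments, every prediction equals the mean of the five toy ratings. A real algorithm would use ``u`` and ``i`` to look up what it learned in ``fit()``.
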
`Create your own prediction algorithm `_

Documentation
=============

For additional information and documentation on how to use the scikit Surprise:

`Surprise documentation `_

:tag:`Machine Learning` :tag:`Surprise` :tag:`Python`