The Prediction Time Of Spark Matrix Factorization

January 29, 2024

I have a simple Python app. It takes ratings.csv, which has user_id, product_id, and rating columns and contains 4 million records. I train a Spark ALS model on it and save the model, then I load it to matrixFactoriz

Solution 1:

Basically, you do not want to load the full model every time you need to answer a query.

Depending on how often the model is updated and on the number of prediction queries, I would either:

Keep the model in memory and answer queries from there. For answers under 100 ms, you will need to measure each step. Livy may be a good fit, but I am not sure about its overhead.
Output the top X predictions for each user and store them in a database. Redis is a good candidate, as it is fast and its values can be lists.
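
As a rough illustration of the second option, here is a minimal sketch in plain Python. It assumes the user and product factor vectors have already been pulled out of the trained model into ordinary dicts (in pyspark.mllib they are exposed by `userFeatures()` and `productFeatures()`). The function names, the toy factors, and the `store` dict standing in for Redis are all hypothetical.

```python
import heapq


def top_n_for_user(user_vec, product_factors, n):
    """Return the n (product_id, score) pairs with the highest dot product."""
    scored = (
        (product_id, sum(u * p for u, p in zip(user_vec, product_vec)))
        for product_id, product_vec in product_factors.items()
    )
    return heapq.nlargest(n, scored, key=lambda pair: pair[1])


def precompute_top_n(user_factors, product_factors, n):
    """Build a user_id -> top-n list mapping once, after each model update."""
    return {
        user_id: top_n_for_user(user_vec, product_factors, n)
        for user_id, user_vec in user_factors.items()
    }


# Toy factors (hypothetical values); real ones would come from the ALS model.
user_factors = {1: [1.0, 0.0], 2: [0.0, 1.0]}
product_factors = {10: [0.9, 0.1], 20: [0.2, 0.8], 30: [0.5, 0.5]}

# `store` stands in for Redis: one key per user, the value is a list.
store = precompute_top_n(user_factors, product_factors, n=2)

# Answering a query is now a key lookup, with no model loading involved.
print([product_id for product_id, _ in store[1]])
```

The same `top_n_for_user` function also covers the first option: if the factor dicts stay loaded in a long-running process, it can be called per request instead of precomputing. Spark can also produce the per-user lists directly with `recommendProductsForUsers`, so the hand-rolled scoring here is only meant to show what is being stored.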
