So I wrote my first smallish web app in Crystal and it’s been running for over two months now without much supervision (thanks to monit and systemd). It basically collects some metadata from IMDB every week or so. Based on that data I’m trying to find underrated movies worth watching. Some general thoughts for future-me:
I like Crystal’s HTTP lib, it seems cleaner than Ruby’s (where I immediately switch to excon, faraday or some other gem).
The Sidekiq port works like a charm and memory usage is sweet. I’m not much concerned about speed itself, as I need to collect data gently anyway. Pushing jobs from a Rails app after switching the Redis connection in Sidekiq is also painless, which opens up some interesting opportunities for mixing those two languages in production.
Composing SQL queries gets clunky pretty fast, as expected (I had some problems in Go once my first app grew). lucky_record from the Lucky framework looks promising though. Seems I can’t live without some sort of ORM anymore.
I miss a built-in auto-reload/file watcher, and I should probably incorporate one into my editor for Crystal projects.
I should pay more attention to return types. They are inferred by the compiler, but avoiding nil union types is probably a good idea.
Semantic UI is really cool for prototyping :P Out of the box you get all the building blocks you need, neat. As a mostly backend developer I found it very handy.
What about the app itself?
Update 2019-02-10: You can find a follow-up on the app here.
You can find it here. You need to use basic auth: <REDACTED> / <REDACTED> (update VII.2017: I have to remove those for now). For various reasons I won’t be able to release it publicly, as Amazon will probably try to shut it down. Sample reference.
But we can look at the data/fun facts (made easier thanks to Metabase). Keep in mind the data is just a small portion of what IMDB has; also, I will only refer to reviews that include a rating.
So far I have over 126,000 movies in 27 different categories and 2,230,000 reviews (metadata only) for those movies. Those are only the reviews that include a rating; most of the really old reviews don’t have one. If I’m not mistaken, that feature was added later on, around 1998.
You can’t get past the 200th page in the category listings, probably due to Elasticsearch pagination limitations (thanks @sztos for this precious knowledge).
There is a movie written by John Malkovich scheduled to be released in 2115.
Seems that IMDB users tend to write reviews when they really like a given movie. A little over 20% of the collected reviews have a 10-star rating (on the other hand, 1-star reviews make up 9% of the total).
In 2006 around 180,000 new reviews were posted. That’s almost 500 reviews a day!
The oldest review is for the movie Gummo; it was added on July 27, 1998.
According to the current formula in the app, the most overrated movie is The Oogieloves in the Big Balloon Adventure, with a rating of 6.1 and a weighted rating based on reviews of 1.8. The second one is Star Wars: Episode VIII - The Last Jedi, which I actually liked ;).
The most underrated movie is Superbabies: Baby Geniuses 2, probably due to troll reviews :P. The second one is The Promise.
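To make the overrated/underrated idea a bit more concrete, here is a rough sketch of how such a gap could be computed. This is not the actual formula from the app (which isn't shown here); it assumes the weighted rating is a helpfulness-weighted mean of review ratings, and the names `Review`, `weighted_rating` and `rating_gap` are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Review:
    rating: int         # stars given by the reviewer, 1..10
    helpful_votes: int  # assumed weight: how many users found the review helpful


def weighted_rating(reviews: Sequence[Review]) -> Optional[float]:
    """Helpfulness-weighted mean of review ratings, or None without reviews."""
    if not reviews:
        return None
    # +1 so that reviews with zero helpful votes still count a little.
    weights = [r.helpful_votes + 1 for r in reviews]
    total = sum(w * r.rating for w, r in zip(weights, reviews))
    return total / sum(weights)


def rating_gap(listed_rating: float, reviews: Sequence[Review]) -> Optional[float]:
    """Positive gap: looks overrated; negative gap: looks underrated."""
    weighted = weighted_rating(reviews)
    if weighted is None:
        return None
    return listed_rating - weighted
```

Sorting movies by this gap would put overrated candidates at one end and underrated ones at the other. A handful of heavily up-voted troll reviews can swing the weighted value a lot, which would fit the Superbabies result.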
The most up-voted review (helpfulness score of 2974 out of 2994) is for the movie The Red Maple Leaf. Note: that’s at least so far; I had a major hiccup with parsing reviews and the data is still being updated.
Movies have 35 reviews on average, with a median of 7. The most reviewed movie is The Dark Knight, with almost 5100 reviews!
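The big difference between the average and the median is what you'd expect when a few blockbusters collect most of the reviews. A tiny sketch with made-up review counts (not real data from the app), picked so that they give the same mean of 35 and median of 7:

```python
from statistics import mean, median

# Made-up review counts for six hypothetical movies, picked so the
# summary has the same shape as the real data (mean 35, median 7).
review_counts = [1, 3, 7, 7, 10, 182]

average = mean(review_counts)
middle = median(review_counts)
# How many movies actually have more reviews than the "average" movie.
above_average = sum(1 for count in review_counts if count > average)

print(f"mean={average}, median={middle}, movies above the mean: {above_average}")
```

In this toy list a single heavily reviewed movie pulls the mean far above what a typical movie gets, so the median is probably the more honest "typical" number.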
That’s it for now. If you found that app somewhat useful, let me know :).