https://github.com/jljacoblo/BlinkistScrapper
Blinkist API
Blinkist data can be accessed from https://www.blinkist.com/api/
For example:
https://www.blinkist.com/api/books/onboarding
https://www.blinkist.com/api/categories
https://www.blinkist.com/api/categories?includes=book_list
https://www.blinkist.com/api/books/trending
https://www.blinkist.com/api/books/everyday-vitality-en
https://www.blinkist.com/api/books/everyday-vitality-en/similar
https://www.blinkist.com/api/books/everyday-vitality-en/chapters
https://www.blinkist.com/api/books/everyday-vitality-en/chapters/5294dd4d396433000c020000
https://www.blinkist.com/api/books/everyday-vitality-en/chapters/5294dd4d396433000c020000/audio
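The example URLs above all follow the same pattern, so they can be assembled from a few path segments. The following is only a small sketch: the helper names api_url and book_url are made up for illustration, and it only builds the URL strings. It does not handle login or cookies, which the real API may require.

```python
from urllib.parse import urlencode

API_ROOT = "https://www.blinkist.com/api"


def api_url(*parts, **query):
    """Join path segments under the API root and append an optional query string."""
    url = "/".join([API_ROOT, *parts])
    if query:
        url += "?" + urlencode(query)
    return url


def book_url(slug, *parts):
    """URL for a book resource, e.g. its chapters or its similar titles."""
    return api_url("books", slug, *parts)


# Rebuild two of the example URLs listed above.
print(api_url("categories", includes="book_list"))
print(book_url("everyday-vitality-en", "chapters"))
```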
Looking at Blinkist Source code
You can download all of a page's HTML, scripts, CSS, and node_modules using the Chrome extension Save All Resources:
https://github.com/up209d/ResourcesSaverExt
https://chrome.google.com/webstore/detail/save-all-resources/abpdnfjocnmdomablahdcfnoggeeiedb?hl=en-US
This is how I worked out the “infinite loop that crashes the developer console” and their API calls.
Browser Developer Console
Handle crashes
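Blinkist appears to put infinite loops in their source code that crash the page only when the developer console is open. Below is an excerpt from their source code. Note that a for (;;) switch (e.prev = e.next) loop is also the shape of regenerator-compiled async functions, so this excerpt by itself may not be the crashing loop.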
return Fv().wrap((function(e) {
for (;;) switch (e.prev = e.next) {
case 0:
return n = t.commit, e.prev = 1, e.next = 4, xo.organisations.me();
case 4:
Option 1: Don't stay on the book reading page
Option 2: Use Brave browser
Hacking Steps:
- Turn on the Brave browser's “Block scripts” feature
- Go to the Blinkist main page
- Open the developer console
- Turn off “Block scripts”
- Wait a while for the CPU to calm down
- Then you can go to the book reading pages
Experiment with turning the “Block scripts” feature on and off. I believe this is how to get around the infinite loops.
Load jQuery
Use Tampermonkey to inject the jQuery library into the web page:
https://github.com/jljacoblo/ActiveTabWebpageCrawlerSynchronous/tree/main/tampermonkey
Scraping
Run the code inside developer console
Categories
Example API call: https://www.blinkist.com/api/categories?includes=book_list
Developer console:
Code not shown, sorry
Output excerpt from categories_book_list.json:
{
"categories": [
{
"id": "5b868435b238e1000726ccba",
"title": "Career & Success",
"slug": "career-and-success-en",
"url": "/en/content/categories/career-and-success-en",
"priority": 26,
"sprite": "career-and-success",
"books_count": 483,
"books": [
...
]
},
...
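Once categories_book_list.json is saved, it can be inspected offline. The following is a rough sketch, not the original console code: it uses only the fields visible in the excerpt above (slug, title, priority, books_count), and the function name summarize_categories is made up. Whether a lower or higher priority value ranks first on the site is not known here, so the sort is simply ascending.

```python
import json


def summarize_categories(payload):
    """Return (slug, title, books_count) tuples ordered by ascending 'priority'."""
    categories = sorted(payload["categories"], key=lambda c: c["priority"])
    return [(c["slug"], c["title"], c["books_count"]) for c in categories]


# Sample data copied from the excerpt above, with the book list left empty.
sample = json.loads("""
{"categories": [{"id": "5b868435b238e1000726ccba", "title": "Career & Success",
  "slug": "career-and-success-en", "priority": 26, "books_count": 483, "books": []}]}
""")
print(summarize_categories(sample))
```

With the real file, the payload would come from json.load on categories_book_list.json instead of the inline sample.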
Books
Data for 5500 books in total.
trending
Example API call: https://www.blinkist.com/api/books/trending
Developer console:
Code not shown, sorry
Output excerpt from currentBooks.json:
{
"the-4-hour-workweek-en": {
"id": "5282267434613800112a0000",
"kind": "book",
"slug": "the-4-hour-workweek-en",
"title": "The 4-Hour Workweek",
"subtitle": "Escape 9–5, Live Anywhere, and Join the New Rich",
"subtitleHtmlSafe": "Escape 9–5, Live Anywhere, and Join the New Rich",
"aboutTheBook": "<p><em>The 4-Hour Workweek </em>(2009) describes ...",
"buyOnAmazonUrl": "/en/books/the-4-hour-workweek-en/purchase",
"author": "Tim Ferriss",
"truncatedAuthor": "Tim Ferriss",
"sourceAuthor": "Tim Ferriss",
"url": "https://www.blinkist.com/en/books/the-4-hour-workweek-en",
"browseUrl": "/en/app/books/the-4-hour-workweek-en",
"previewUrl": "/en/books/the-4-hour-workweek-en",
"readUrl": "/en/nc/reader/the-4-hour-workweek-en",
"playUrl": "/en/nc/reader/the-4-hour-workweek-en?play=1",
"readingDuration": 28,
"minutesToRead": 28,
"publishedAt": "2012-10-16T11:52:22.000+00:00",
"isAudio": true,
"readCount": "40.3k",
"image": {
"sources": [
...
]
},
"audioUrl": "",
"chaptersLength": 12,
"hasAudio": true,
"language": "en",
"freeDaily": null,
"isFree": false,
"category": {
"title": "Money & Investments",
"sprite": "money-and-investments",
"slug": "money-and-investments-en"
},
"averageRating": 4.3,
"totalRatings": 1671,
"categories": [
...
]
},
...
}
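Because currentBooks.json is keyed by book slug and each book carries a category object, the books can be regrouped by category offline. The following is a hedged sketch under that assumption. The function name books_by_category is made up, and the sample keeps only the fields the function reads.

```python
from collections import defaultdict


def books_by_category(current_books):
    """Map each category slug to the sorted list of book slugs filed under it."""
    grouped = defaultdict(list)
    for slug, book in current_books.items():
        grouped[book["category"]["slug"]].append(slug)
    return {category: sorted(slugs) for category, slugs in grouped.items()}


# Trimmed-down sample in the shape of the excerpt above.
sample_books = {
    "the-4-hour-workweek-en": {
        "title": "The 4-Hour Workweek",
        "category": {"title": "Money & Investments", "slug": "money-and-investments-en"},
    }
}
print(books_by_category(sample_books))
```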
latest
Same as trending, but use the API link for the latest books (presumably /api/books/latest).
similar
Example API call: https://www.blinkist.com/api/books/everyday-vitality-en/similar
Developer console:
Code not shown, sorry
Chapter Summary
Example API calls: https://www.blinkist.com/api/books/everyday-vitality-en/chapters
Needs currentBooks.json
Developer console:
Code not shown, sorry
Output excerpt from curBooksChapters.json:
{
"the-4-hour-workweek-en": {
"book": {
"id": "5282267434613800112a0000",
"slug": "the-4-hour-workweek-en",
"title": "The 4-Hour Workweek",
"author": "Tim Ferriss",
"time": 28,
"cover": {
"default": {
"src": "https://images.blinkist.io/images/books/5282267434613800112a0000/1_1/470.jpg",
"srcset": {
"2x": "https://images.blinkist.io/images/books/5282267434613800112a0000/1_1/640.jpg"
}
},
"sources": [...]
},
"freeDaily": false
},
"chapters": [
{
"id": "5282270c3334640008020000",
"order_no": 0,
"action_title": "What’s in it for me? Learn to make time for the important things in your life."
},
{
"id": "528227343334640008040000",
"order_no": 1,
"action_title": "For the New Rich, wealth means luxury in the here and now."
},
...
],
"current_chapter_id": null
},
}
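The chapter list gives every chapter id needed for the per-chapter API call. The following is a sketch of how those URLs could be derived from the structure shown above. It is not the original console code, and the function name chapter_urls is made up.

```python
def chapter_urls(books_chapters):
    """Yield (book_slug, order_no, url) for every chapter, in reading order."""
    for slug, entry in books_chapters.items():
        for chapter in sorted(entry["chapters"], key=lambda c: c["order_no"]):
            url = f"https://www.blinkist.com/api/books/{slug}/chapters/{chapter['id']}"
            yield slug, chapter["order_no"], url


# Sample in the shape of the excerpt above, keeping only the fields used here.
sample_chapters = {
    "the-4-hour-workweek-en": {
        "chapters": [
            {"id": "5282270c3334640008020000", "order_no": 0},
            {"id": "528227343334640008040000", "order_no": 1},
        ]
    }
}
for row in chapter_urls(sample_chapters):
    print(row)
```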
Example API calls: https://www.blinkist.com/api/books/everyday-vitality-en/chapters/5294dd4d396433000c020000
Needs curBookChapters.json, currentBooks.json
Developer console:
Code not shown, sorry
Output excerpt from booksAllChapters.json:
{
"the-4-hour-workweek-en": [
{
"id": "5282270c3334640008020000",
"order_no": 0,
"action_title": "What’s in it for me? Learn to make time for the important things in your life.",
"text": "<p>The four-hour workweek. It sounds amazing, right. It sounds like a dream. Instead of working for 40 hours a week, you only ...",
"audio_url": "https://hls.blinkist.io/bibs/5282267434613800112a0000/5282270c3334640008020000-T1632499808.m4a",
"signed_audio_url": "https://hls.blinkist.io/bibs/5282267434613800112a0000/5282270c3334640008020000-T1632499808.m4a?Expires=1673380008&Signature=HaaOrmx3vWsagkZL1dwDvWztHt-DrBUp5Q1XTveLk7MQBYy1FHJdewpFDZVIgRrPEjBMRk5EJFR1JQB0SIToHbiHL10ol1U18NRhiXjTq-DLnxDwtIrnhsdvdeTQpHpV2oTTtf6ubAdhMesemHXc5sqOMq5EVeSShr7NgLfCHiwp-S6Y3nrwb5~Y7~7RPYHXXhp0z2eJcoV-XJc5sqN8-3l9l8JzHj-pFiN-uL6PS14ufbW7j6mlyN6vTePQG1xckh9QzhdbaNqUptUKwYNQjcZeGnNDPwS6pNeoPoMUKJ65OQsIbpxD5q5HsvywuHJmu5B8akr7~JrA2U8fJ18Etg__&Key-Pair-Id=APKAJXJM6BB7FFZXUB4A"
},
{
"id": "528227343334640008040000",
"order_no": 1,
"action_title": "For the New Rich, wealth means luxury in the here and now.",
"text": "",
"audio_url": "",
"signed_audio_url": ""
},
...
],
...
}
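In the excerpt above, the second chapter came back with empty text and audio fields. A quick offline check can list such gaps before later steps rely on the data. This is only a sketch: the function name missing_content is made up, and it assumes every chapter has the text and audio_url keys shown above.

```python
def missing_content(all_chapters):
    """Map each book slug to the order_no of chapters with empty text or audio_url."""
    report = {}
    for slug, chapters in all_chapters.items():
        gaps = [c["order_no"] for c in chapters if not c["text"] or not c["audio_url"]]
        if gaps:
            report[slug] = gaps
    return report


# Sample in the shape of the excerpt above, with shortened field values.
sample_all = {
    "the-4-hour-workweek-en": [
        {"order_no": 0, "text": "<p>The four-hour workweek.</p>", "audio_url": "https://hls.blinkist.io/..."},
        {"order_no": 1, "text": "", "audio_url": ""},
    ]
}
print(missing_content(sample_all))
```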
Download All Book covers
Requires currentBooks.json
In downloadAllBookCover.py:
Code not shown, sorry
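Since the original script is not shown, here is a minimal sketch of what a cover downloader could look like. It assumes that every cover follows the URL pattern seen in the chapter excerpt (images.blinkist.io/images/books/<id>/1_1/470.jpg) and that currentBooks.json maps each slug to an object with an id field. The names cover_url and download_covers and the covers output folder are hypothetical.

```python
import json
import urllib.request
from pathlib import Path

# Assumption: all covers follow the pattern seen in the chapter excerpt.
COVER_TEMPLATE = "https://images.blinkist.io/images/books/{book_id}/1_1/{size}.jpg"


def cover_url(book_id, size=470):
    """Build the cover image URL for one book id."""
    return COVER_TEMPLATE.format(book_id=book_id, size=size)


def download_covers(books_json="currentBooks.json", out_dir="covers"):
    """Download one cover per book into out_dir, named after the book slug."""
    books = json.loads(Path(books_json).read_text(encoding="utf-8"))
    target = Path(out_dir)
    target.mkdir(exist_ok=True)
    for slug, book in books.items():
        urllib.request.urlretrieve(cover_url(book["id"]), str(target / f"{slug}.jpg"))


print(cover_url("5282267434613800112a0000"))
```

Calling download_covers() in the folder that holds currentBooks.json would then fetch the images one by one.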
Download all chapters audio
Download the audio for every book's chapter blinks using the in-browser developer console.
The download links for these audio files are time-limited.
The current method is to download each file through the browser's “Save as” dialog.
We use AutoHotkey to automate this process, downloading the audio files one by one.
API calls
Call:
// Request the signed, time-limited audio URL for one chapter.
$.ajax('/api/books/everyday-vitality-en/chapters/617033a56cee0700087aa566/audio', {
  type: 'GET',
  success: function (data, status, xhr) {
    console.log(data); // object with a single "url" field
  }
});
Result:
{
"url": "https://hls.blinkist.io/bibs/617033a36cee0700087aa564/617033a56cee0700087aa566-T1634743314.m4a?Expires=1673360760&Signature=XIEsN3WN-WrBz9tWEENgiY9gzG6dqB8dz-vvSuL1sHLIKDkjCloUMvzpUeJKEXsrZfN7rbQV9nw6tYLFXXNh-s5dNOHg0Fsv2PUUd6Mck9pg-OjbSEbtHtSmzrg2PFqxEqRC1jL8eNBVHUDmP4NH0-5uhVJXaTF73CNjR7ritgo98Z4l-y3~wnLtuq1aXQIua224sdBVwzDM9wOx9AUXddz7CGjvlx4R7W9ShRgHe2TzJQS0x~TxDNzHHnigyvs8FSD3EaT9UjyAXz8D7Bt7aX0FU6zHr4RPRNMPb0X2VyDMhVZCRVOKtVPKEAKQ0NDUDZLPkoYye~NN4ZAab1QsZA__&Key-Pair-Id=APKAJXJM6BB7FFZXUB4A"
}
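The signed URL carries its own deadline in the Expires query parameter, which looks like a Unix timestamp. The following is a small sketch for reading it, so a download queue can skip or refresh links that have already lapsed. The function names are made up, and the sample URL is shortened with placeholder Signature and Key-Pair-Id values.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlparse


def signed_url_expiry(url):
    """Return the Expires query parameter of a signed URL as a UTC datetime."""
    query = parse_qs(urlparse(url).query)
    return datetime.fromtimestamp(int(query["Expires"][0]), tz=timezone.utc)


def is_expired(url, now=None):
    """True if the signed URL's deadline has already passed."""
    now = now or datetime.now(timezone.utc)
    return now >= signed_url_expiry(url)


# Shortened sample: only the Expires value is taken from the result above.
sample_url = (
    "https://hls.blinkist.io/bibs/BOOK_ID/CHAPTER_ID.m4a"
    "?Expires=1673360760&Signature=PLACEHOLDER&Key-Pair-Id=PLACEHOLDER"
)
print(signed_url_expiry(sample_url).isoformat())
```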
step 1
Create the AutoHotkey script downloadAudio.ahk.
It downloads the target of an href link through the browser's right-click “Save as” dialog.
step 2
Convert downloadAudio.ahk to downloadAudio.exe.
step 3
Create a custom URL scheme so that the browser can run that exe through an autohotkey:// link.
Something like this:
https://dev.to/pybash/making-a-custom-protocol-handler-and-uri-scheme-part-3-3fji
Basically, you can add url_scheme.reg to your Windows Registry.
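The contents of url_scheme.reg are not shown, so the following is only a sketch of what a typical Windows URL protocol registration looks like, generated from Python. The exe path C:\Tools\downloadAudio.exe is a hypothetical location, and the function name url_scheme_reg is made up. The key layout (a "URL Protocol" value plus shell\open\command) is the standard way Windows maps a scheme to a program.

```python
def url_scheme_reg(scheme, exe_path):
    """Return .reg file text that makes `scheme://...` links launch exe_path."""
    escaped = exe_path.replace("\\", "\\\\")  # .reg string values need doubled backslashes
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[HKEY_CLASSES_ROOT\\{scheme}]",
        f'@="URL:{scheme} Protocol"',
        '"URL Protocol"=""',
        "",
        f"[HKEY_CLASSES_ROOT\\{scheme}\\shell]",
        "",
        f"[HKEY_CLASSES_ROOT\\{scheme}\\shell\\open]",
        "",
        f"[HKEY_CLASSES_ROOT\\{scheme}\\shell\\open\\command]",
        # "%1" is replaced by the full URL that was clicked or opened.
        f'@="\\"{escaped}\\" \\"%1\\""',
    ]
    return "\r\n".join(lines) + "\r\n"


# Hypothetical install path, for illustration only.
print(url_scheme_reg("autohotkey", "C:\\Tools\\downloadAudio.exe"))
```

The exe receives the whole URL as its first argument, so the script has to strip the autohotkey:// prefix itself.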
step 4
Now, inside the Chrome developer console.
Requires curBookChapters.json. In pullEachAudioRecursive.js:
Code not shown, sorry
Convert to Anki note cards
jsonToAnki.py can convert all the books’ blinks into Anki note cards.
Each Anki note card is a chapter of a book.
Each book is one Anki deck, nested as a sub-deck under its category.
Requires currentBooks.json, curBookChapters.json, booksAllChapters.json
In jsonToAnki.py:
convertBlinkistJson2Anki('categories','currentBooks', 'booksAllChapters')
Output File: Blinkist.apkg
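The internals of jsonToAnki.py are not shown. The following is a sketch of how the deck layout described above could be expressed, using Anki's "::" separator for sub-decks. The root deck name Blinkist, the front/back field mapping (chapter title on the front, chapter text on the back), and the function names are all assumptions. The sketch only builds plain dictionaries and does not write an .apkg file.

```python
def deck_name(book, root="Blinkist"):
    """Anki nests decks with '::', so this yields Root::Category::Book title."""
    return "::".join([root, book["category"]["title"], book["title"]])


def chapter_notes(book, chapters):
    """One note per chapter: action_title on the front, chapter HTML on the back."""
    deck = deck_name(book)
    return [
        {"deck": deck, "front": ch["action_title"], "back": ch["text"]}
        for ch in sorted(chapters, key=lambda c: c["order_no"])
    ]


# Samples in the shape of currentBooks.json and booksAllChapters.json, shortened.
sample_book = {
    "title": "The 4-Hour Workweek",
    "category": {"title": "Money & Investments", "slug": "money-and-investments-en"},
}
sample_chapter_list = [
    {"order_no": 0, "action_title": "What's in it for me?", "text": "<p>The four-hour workweek.</p>"},
]
print(chapter_notes(sample_book, sample_chapter_list))
```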