How it Works
TikTokPy works without needing any login information or cookies. This is powerful for anonymity and avoiding timeouts, but this does come with its limitations.
TikTokPy grabs information in two steps:
Warning
The following explanation is only valid for library versions prior to 0.2.0. This page will eventually be updated to reflect the new behavior.
Grabbing Preloaded Content
When you load a TikTok page, some JavaScript performs an API call and inserts the data directly into the HTML of the page. This data is used to insert any preloaded videos, comments, users, etc. This data is not removed from the HTML, so TikTokPy grabs it do initialize its data models.
The data is saved in the HTML in a script tag:
<script id="SIGI_STATE" type="application/json">
{
<!-- API response packed as JSON, to be loaded into Pydantic models -->
}
</script>
Only certain portions of the data is actually retrieved, and some of it is post-processed for user simplicity.
Grabbing Additional API Data
As more of the page is loaded, some scripts make some additional API calls to retrieve more data to display on the page. This data is not saved anywhere in the HTML, so TikTokPy intercepts these API calls and makes them itself. It uses the cookies and headers that would have been used by the page scripts to make the API calls. This allows TikTokPy to be aware of and make the API calls made when you normally browse TikTok without needing any login information.
Limitations
Slideshows
TikTok limits slideshows to mobile viewing. In order to retrieve data from slideshows, you must emulate a mobile device:
from tiktokapipy.api import TikTokAPI
def get_challenge_videos_scroll_down():
with TikTokAPI(emulate_mobile=True) as api:
slideshow = api.video(link_to_slideshow)
...
from tiktokapipy.async_api import AsyncTikTokAPI
async def get_challenge_videos_scroll_down():
async with AsyncTikTokAPI(emulate_mobile=True) as api:
slideshow = await api.video(link_to_slideshow)
...
See Examples for more.
Data Collection
By default, TikTokPy only grabs preloaded content and any API calls made during the page load. This is fairly limiting for the amount of data you can retrieve. In order to grab more data, additional API calls will be needed to be triggered. This can only be done by scrolling down on the page. The amount to scroll down by is determined by how long you want TikTokPy to scroll down for:
from tiktokapipy.api import TikTokAPI
def get_challenge_videos_scroll_down():
# scroll down for 2.5 seconds when making requests
with TikTokAPI(scroll_down_time=2.5) as api:
challenge = api.challenge(tag_name)
for video in challenge.videos:
...
from tiktokapipy.async_api import AsyncTikTokAPI
async def get_challenge_videos_scroll_down():
# scroll down for 2.5 seconds when making requests
async with AsyncTikTokAPI(scroll_down_time=2.5) as api:
challenge = await api.challenge(tag_name)
async for video in challenge.videos:
...
Warning
This will not work with mobile emulation. In order to take advantage of slideshow scraping alongside additional
data scraping with scroll_down_time
, you will need to create a second, mobile-emulating API instance. You will
not be able to retrieve any data that isn’t retrieved during page load with this second API.
Note
scroll_down_time
is also an available option on API calls. Specifying a value here will override the default
value set in the API constructor.