130 Commits

Author SHA1 Message Date
rlaphoenix
c0d940b17b Remove Track.needs_proxy
Ok, so there's a few reasons this was done.

1) Design-wise it isn't valid to have --proxy (or via config/otherwise) set a proxy, then unpredictably have it bypassed or disabled. If I specify `--proxy 127.0.0.1:8080`, I would expect it to use that proxy for all communication indefinitely, not switch in and out depending on the track or service.

2) With reason 1, it's also a security problem. The only reason I implemented it in the first place was so I could download faster on my home connection. This means I would authenticate and call APIs under a proxy, then suddenly download manifests, segments, etc. under my home connection. A competent service could see that as an indicator of foul play and flag you.

3) Maintaining this setup across the codebase is extremely annoying, especially because of how proxies are set up and used by Requests in the Session. There's no way to tell a requests session to temporarily disable the proxy and turn it back on later. You have to get the proxy from the session (in an annoying way), store it, remove it, make the calls, and then, assuming you're still in the same function, add it back. If you're not in the same function, well, time for some spaghetti code.

---

tl;dr: -1 for UX/design/expectations with the CLI, -1 for security, -1 for code maintenance, and only +1 for potentially increased download speeds in certain scenarios.
2023-12-29 20:25:57 +00:00
rlaphoenix
7cec16d8ab Validate track languages in HLS.to_tracks 2023-12-02 22:40:41 +00:00
rlaphoenix
e87de50940 Exclude fragmented Sub Codecs from DASH UTF-8 checks
Chardet was detecting a mixture of mostly cp1252 and MacRoman encodings in these boxes, when they should just be left as-is when parsing. The actual text within a box may still want to go through `try_ensure_utf8` when parsed, but not the entire box.
2023-12-02 17:44:47 +00:00
Shivelight
c31ee338dc
Add option for automatic subtitle character encoding normalization (#68)
* Add option for automatic subtitle character encoding normalization

The rationale behind this function is that some services use ISO-8859-1
(latin1) or Windows-1252 (CP-1252) instead of UTF-8 encoding, whether
intentionally or accidentally. Some services even stream subtitles with
malformed/mixed encoding (each segment has a different encoding).

* Remove Subtitle parameter `auto_fix_encoding`

Just always attempt to fix encoding. If the subtitle is neither UTF-8 nor CP-1252, then it should realistically error out instead of producing garbage Subtitle data anyway.

* Move Subtitle encoding fixing code out of if drm tree

* Use chardet as a last ditch effort fixing Subs, or return original data

* Move Subtitle.fix_encoding method to utilities as try_ensure_utf8

* Add Shivelight as a contributor

---------

Co-authored-by: rlaphoenix <rlaphoenix@pm.me>
2023-12-02 11:00:55 +00:00
rlaphoenix
4b8cfabaac Fix all Ruff and isort linter errors 2023-12-02 09:57:13 +00:00
rlaphoenix
f3cfaa3ab3 Fix DASH FPS error when SegmentBase is not found 2023-07-15 18:08:01 +01:00
rlaphoenix
6cfbaa7db1 Pass cookies to the aria2c and requests downloaders
For aria2c I've simplified the operation by offloading most of the work of creating a cookie header to the same logic Python-requests uses. This results in the exact same cookies Python-requests would have used in a requests.get() call or similar. It supports multiple cookies with the same name under different domains/paths, based on the URI of the mock request.
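
A minimal sketch of the idea, not the code from this commit. The cookie jar in Python-requests builds on the standard library's `http.cookiejar`, so this example uses the standard library directly. `make_cookie` and `cookie_header_for` are hypothetical helper names, and the domains and values are made up:

```python
from http.cookiejar import Cookie, CookieJar
from typing import Optional
from urllib.request import Request


def make_cookie(name: str, value: str, domain: str, path: str = "/") -> Cookie:
    """Build a simple version-0 session cookie (helper for this sketch only)."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain,
        domain_specified=domain.startswith("."),
        domain_initial_dot=domain.startswith("."),
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def cookie_header_for(jar: CookieJar, url: str) -> Optional[str]:
    """Return the Cookie header value the jar would send to `url`, or None."""
    # The request is never sent; it only drives domain and path matching.
    mock_request = Request(url)
    jar.add_cookie_header(mock_request)
    return mock_request.get_header("Cookie")


jar = CookieJar()
# Two cookies with the same name under different domains.
jar.set_cookie(make_cookie("session", "api-token", "api.example.com"))
jar.set_cookie(make_cookie("session", "cdn-token", "cdn.example.com"))

print(cookie_header_for(jar, "https://cdn.example.com/seg/1.ts"))
```

The resulting string could then be handed to an external downloader as a plain `Cookie` header, so the downloader does not need its own cookie matching logic.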
2023-05-29 22:23:39 +01:00
rlaphoenix
fd52073605 Skip merging of HLS segments if --skip-dl is used
Partially fixes #61
2023-05-27 20:20:07 +01:00
rlaphoenix
df2f9b85ae Use urljoin instead of an if check and + op in HLS
This was used even before devine was public, but it was constantly changed back and forth between a urljoin(), another form of urljoin (something custom, or something I can't remember), and an if check + addition.

However, I can confirm that a simple if check will not work as the Base URI might not even be in the same relative root. The if checks have also been inconsistent with some checking if it starts with http(s)://, and some checking if it does not have the base URI at the start of the string.

The if check method does not work as well as a urljoin() has the potential to. Using urljoin() also fixes some services whose HLS playlists have the m3u8 URL on a completely different root, subdomain, or even domain, which caused segment downloads to break completely.
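
A minimal sketch of the difference, using only `urllib.parse.urljoin` from the standard library. The playlist and segment URLs are made up for illustration, and `naive_join` is a hypothetical stand-in for the old if check, not the code that was removed:

```python
from urllib.parse import urljoin

# Hypothetical playlist URL, used as the base URI.
PLAYLIST_URL = "https://cdn.example.com/video/master.m3u8"


def naive_join(base: str, uri: str) -> str:
    """Stand-in for the old approach: prepend the base directory unless absolute."""
    if uri.startswith(("http://", "https://")):
        return uri
    return base.rsplit("/", 1)[0] + "/" + uri


segment_uris = [
    "seg/0001.ts",                          # relative to the playlist directory
    "../audio/0001.aac",                    # relative, one directory up
    "/other/0001.ts",                       # different root on the same host
    "https://media.example.net/a/0001.ts",  # different domain entirely
]

for uri in segment_uris:
    print(uri)
    print("  urljoin:", urljoin(PLAYLIST_URL, uri))
    print("  naive:  ", naive_join(PLAYLIST_URL, uri))
```

The root-relative URI is where the two approaches diverge: `urljoin` resolves it against the host, while the naive version glues it onto the playlist directory.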
2023-05-21 00:06:30 +01:00
rlaphoenix
8ada6165e3 Set stop event & mark track failed if DASH DRM fails to license 2023-05-19 19:07:35 +01:00
rlaphoenix
6e844409ae Set stop event & mark track failed if HLS Session DRM fails to license 2023-05-19 19:07:06 +01:00
rlaphoenix
c9ecab444f Use range offset when calculating HLS init map byte ranges 2023-05-19 18:38:33 +01:00
rlaphoenix
3e0b7ef200 Fix regression where Range header is accidentally kept and re-used 2023-05-19 00:35:46 +01:00
rlaphoenix
dd64212ad2 Move download_segment() from DASH/HLS download_track() to Class
Various overall small readability improvements have also been made.
2023-05-17 03:20:01 +01:00
rlaphoenix
03c012f88e Move the Downloaded msg after Decrypt msg in DASH/URL downloads 2023-05-17 02:09:16 +01:00
rlaphoenix
6cdde3efb0 Override the downloader more efficiently in DASH/HLS when Range is used 2023-05-17 01:33:06 +01:00
rlaphoenix
6d4be8620c Only write segment data if the tfhd fix was necessary in DASH 2023-05-17 01:22:59 +01:00
rlaphoenix
681d69d5e5 Mark DASH and URL tracks as Decrypting when using shaka
DASH and normal URL downloads now both decrypt one large single or merged file after all downloads are finished. This leaves a bit of a "pause" between progress bar movement which looks a bit odd. So mark the track as being in a Decrypting state.
2023-05-16 22:01:07 +01:00
rlaphoenix
a45c784569 Replace download speeds with "Downloaded" text when finished 2023-05-16 21:59:03 +01:00
rlaphoenix
2a8307b98d Decrypt DASH downloads after merging all segments
Since DASH doesn't have the ability to change keys dynamically per-track (Representation), there's no need for the DASH downloader to decrypt segments as they are downloaded (like HLS).

This halves the number of processes that need to be opened, as well as the I/O usage. It may result in noticeably lower CPU usage. Since IOPS are lowered, you may even see an increase in download speed when downloading to something like a slow HDD.

This also fixes decryption in some weird edge-cases where decrypting each segment individually resulted in timestamp anomalies causing shaka to fail.
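
A rough sketch of the merge-then-decrypt order described here, assuming the segments are already on disk. `merge_then_decrypt` is a hypothetical name, and `decrypt` stands in for whatever performs the actual decryption (for example a call out to shaka); neither is taken from the codebase:

```python
import shutil
from pathlib import Path
from typing import Callable, Iterable


def merge_then_decrypt(
    segments: Iterable[Path],
    merged: Path,
    decrypt: Callable[[Path], None],
) -> None:
    """Concatenate segment files in order, then decrypt the merged file once."""
    with merged.open("wb") as out:
        for segment in segments:
            with segment.open("rb") as src:
                shutil.copyfileobj(src, out)
    # One decryption call for the whole track instead of one per segment.
    decrypt(merged)
```

With N segments this makes a single `decrypt` call rather than N, which is where the process and I/O savings come from.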
2023-05-16 21:55:53 +01:00
rlaphoenix
e7dc138c0f Improve readability and documentation of DASH's to_tracks function 2023-05-15 16:19:53 +01:00
rlaphoenix
cb82febb7c Add ability to choose downloader via config 2023-05-12 06:42:33 +01:00
rlaphoenix
b92708ef45 Alter behaviour of --skip-dl to allow DRM licensing
Most people used --skip-dl just to license the DRM pre-v1.3.0, which makes sense, as --skip-dl is otherwise a pointless feature. I've fixed it so that --skip-dl works like before, allowing license calls, while still supporting the new per-segment features post-v1.3.0.

Fixes #37
2023-05-11 22:17:41 +01:00
rlaphoenix
3ec317e9d6 Pass manifest to DASH downloader instead of re-requesting it
Fixes #51
2023-05-11 20:46:37 +01:00
rlaphoenix
5ca2f256d5 Fix URL used on final chance to get Track KID on DASH downloads
segments[0] is the first tuple, which holds two values: the URL and an optional byte range. This accidentally passed the whole tuple rather than the URL within it.

Fixes #54
2023-05-09 13:04:20 +01:00
Hollander_1908
d894e5bbe0
Was not able to use the initialization from a DASH segment_list (#47)
* Was not able to use the initialization from a DASH segment_list

* Check if initialization in DASH has attribute range
2023-03-26 20:01:17 +01:00
rlaphoenix
71cf2b4016 Fix rare issue where DASH/HLS dl speed divides by 0 2023-03-26 14:30:12 +01:00
Hollander_1908
5eedbe1f59
DASH: improved forced subtitle recognition
Some manifests use the value `forced_subtitle` instead of the regular `forced-subtitle`.
This way both are recognized.
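
A small sketch of accepting both spellings. It assumes the role values have already been read from the manifest; `FORCED_VALUES` and `is_forced` are illustrative names, not the ones used in the codebase:

```python
from typing import Iterable

# Both the regular hyphenated value and the underscore variant.
FORCED_VALUES = frozenset({"forced-subtitle", "forced_subtitle"})


def is_forced(role_values: Iterable[str]) -> bool:
    """Return True if any role value marks the track as a forced subtitle."""
    return any(value in FORCED_VALUES for value in role_values)


print(is_forced(["subtitle", "forced_subtitle"]))
```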
2023-03-14 12:56:34 +01:00
rlaphoenix
ddf1c519e0 Try get track language from representation ID on DASH playlists 2023-03-13 01:09:52 +00:00
rlaphoenix
d8acdda044 Silence DASH and HLS logs unless it's the last attempt 2023-03-12 00:09:02 +00:00
rlaphoenix
055bc927f5 Add a 5-attempt retry system to DASH & HLS downloads 2023-03-11 19:28:02 +00:00
rlaphoenix
111dac9264 Fix association of preceding HLS EXT-X-KEYs with m3u8 fork
This will improve efficiency and accuracy of getting appropriate DRM systems when downloading segments.

This can dramatically improve download speed from less than 50 kb/s to full speed if the HLS playlist used a lot of AES-128 EXT-X-KEYs. E.g., a unique key for each segment.

The slowdown was caused by the HLS.get_drm function taking EVERY EXT-X-KEY, checking for supported systems, loading them, and returning the supported objects. This meant it could load possibly 100s of AES-128 ClearKey objects (likely requiring URL downloads for the key URI), causing a huge delay before downloading each segment.
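
To illustrate the association, here is a deliberately simplified sketch that pairs each segment with the EXT-X-KEY line in effect for it, instead of collecting every key in the playlist. It does its own line-based parsing rather than using the m3u8 library, tracks only one key at a time (real playlists may carry several keys with different key formats for the same segment), and `keys_per_segment` is a hypothetical name:

```python
from typing import List, Optional, Tuple

KEY_TAG = "#EXT-X-KEY:"


def keys_per_segment(playlist: str) -> List[Tuple[str, Optional[str]]]:
    """Pair each segment URI with the attributes of the key in effect for it."""
    current_key: Optional[str] = None
    pairs: List[Tuple[str, Optional[str]]] = []
    for raw_line in playlist.splitlines():
        line = raw_line.strip()
        if line.startswith(KEY_TAG):
            # A key applies to the segments that follow it, until the next key.
            current_key = line[len(KEY_TAG):]
        elif line and not line.startswith("#"):
            pairs.append((line, current_key))
    return pairs


PLAYLIST = """#EXTM3U
#EXT-X-KEY:METHOD=AES-128,URI="https://example.com/key1"
#EXTINF:4.0,
seg1.ts
#EXT-X-KEY:METHOD=AES-128,URI="https://example.com/key2"
#EXTINF:4.0,
seg2.ts
#EXTINF:4.0,
seg3.ts
"""

for uri, key in keys_per_segment(PLAYLIST):
    print(uri, "->", key)
```

Only the key attached to a segment then needs to be loaded before that segment is downloaded.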
2023-03-09 21:46:48 +00:00
rlaphoenix
abf6c71688 Specify HLS Track Key IDs to prepare_drm
This also moves the init data code before drm related code, just so it has the init data ready to retrieve the Key ID from.
2023-03-08 22:45:41 +00:00
rlaphoenix
a549cc6afb Specify DASH Track Key IDs to prepare_drm 2023-03-08 22:41:58 +00:00
rlaphoenix
573dd8cd49 Don't immediately license DASH DRM until used
This is unnecessary as the DASH track may get converted into a URL track, which will also prepare the DRM.
2023-03-08 21:42:05 +00:00
rlaphoenix
b3fdafcf06 Simplify Base URL joining and calculation on DASH
This also fixes some DASH manifests that use multiple BaseURL definitions that must be joined together.
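
One way to picture joining several BaseURL definitions is to fold them, outermost first, onto the manifest URL with `urljoin`. This is only a sketch under that assumption; `resolve_base_url` is a hypothetical helper and the URLs are invented:

```python
from functools import reduce
from typing import Sequence
from urllib.parse import urljoin


def resolve_base_url(manifest_url: str, base_urls: Sequence[str]) -> str:
    """Join BaseURL values in order (outermost first) onto the manifest URL."""
    return reduce(urljoin, base_urls, manifest_url)


MANIFEST_URL = "https://example.com/vod/manifest.mpd"

# An absolute BaseURL followed by two relative ones at deeper levels.
base = resolve_base_url(
    MANIFEST_URL,
    ["https://cdn.example.net/content/", "video/", "1080p/"],
)
print(urljoin(base, "init.mp4"))
```

An absolute BaseURL replaces everything before it, while relative ones extend the path, so the same fold covers both cases.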
2023-03-07 12:36:00 +00:00
rlaphoenix
d175ffaf15 Add support for byte-range on HLS init maps 2023-03-04 12:21:28 +00:00
rlaphoenix
1b1412d498 Fix byte range calculation on HLS downloads
It was off by one. The final calculation for the right-side range needed to be converted from one-indexed to zero-indexed.
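
The off-by-one is easy to see in a small sketch. An HLS byte range is written as `<length>[@<offset>]`, while an HTTP Range header uses an inclusive, zero-indexed last byte position, so the right side is `offset + length - 1`. `byterange_to_http_range` and its `next_offset` parameter are illustrative names, not the function from the codebase:

```python
def byterange_to_http_range(byterange: str, next_offset: int = 0) -> str:
    """Convert an HLS byte range ("<length>[@<offset>]") to an HTTP Range value.

    `next_offset` is used when the offset is omitted; it should be the byte
    right after the previous sub-range.
    """
    length_text, _, offset_text = byterange.partition("@")
    length = int(length_text)
    offset = int(offset_text) if offset_text else next_offset
    # HTTP ranges are inclusive, so the last byte is offset + length - 1.
    last_byte = offset + length - 1
    return f"bytes={offset}-{last_byte}"


print(byterange_to_http_range("1000@0"))
print(byterange_to_http_range("500@1000"))
```

Using `offset + length` as the right side would request one byte too many, which is the kind of error this commit describes.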
2023-03-04 12:18:19 +00:00
rlaphoenix
318832e6b2 Store DRM in the track.drm property in HLS and DASH 2023-03-04 11:49:53 +00:00
rlaphoenix
f8166f098c Apply threading lock to HLS DRM preparation
Without this, if two threads started at the same time there was a very good chance they would run the code and license twice, which is unnecessary.
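
A minimal sketch of the pattern, using `threading.Lock` so that only the first thread performs the licensing and the rest reuse the result. `SessionDrm` and `_request_license` are hypothetical stand-ins; the real licensing call is not shown:

```python
import threading
from typing import Optional


class SessionDrm:
    """Hypothetical holder for a license that should only be requested once."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.license: Optional[str] = None
        self.license_calls = 0

    def _request_license(self) -> str:
        # Placeholder for the real (slow) license request.
        self.license_calls += 1
        return "license-data"

    def prepare(self) -> str:
        # Only one thread at a time may check and set the license.
        with self._lock:
            if self.license is None:
                self.license = self._request_license()
            return self.license


drm = SessionDrm()
threads = [threading.Thread(target=drm.prepare) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(drm.license_calls)
```

Without the lock, two threads could both see `license is None` and each make a license request.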
2023-03-04 11:41:10 +00:00
rlaphoenix
c3a22431f0 Fix possible soft-lock in HLS if Queue is left empty after error 2023-03-04 11:11:20 +00:00
rlaphoenix
9fff14af30 Fix regression that broke pproxy 2023-03-03 08:53:28 +00:00
rlaphoenix
a3efadf00b Fix aria2c's segmented check for DASH/HLS 2023-03-03 07:52:13 +00:00
rlaphoenix
9e23ee13bb Remove silent args in aria2c calls for HLS/DASH 2023-03-03 07:52:13 +00:00
rlaphoenix
9f48aab80c Shutdown HLS & DASH dl pool, pass exceptions to dl
This results in noticeably faster cancellation of segmented track downloads on CTRL+C and errors. It also reduces code duplication, as the dl code will now handle the exception and cleanup for them.

This also simplifies the STOPPING/STOPPED and FAILING/FAILED status messages by quite a bit.
2023-03-01 11:08:52 +00:00
rlaphoenix
624bb6fe75 Only calculate DASH/HLS dl speed if dl sizes are available 2023-03-01 11:08:52 +00:00
rlaphoenix
840db6e689 Move segment merging from dl to DASH/HLS classes 2023-03-01 08:54:35 +00:00
rlaphoenix
f4ad7a2e6c Mark track as stopping when skipping segments 2023-02-28 18:14:03 +00:00
rlaphoenix
383e7d9647 Add full support for CTRL+C on HLS and DASH 2023-02-28 18:05:04 +00:00
rlaphoenix
9cfda3bb9c Don't shutdown pool or the for loop will lock
Since I'm using `futures.as_completed()`, shutting down the pool means the for loop never gets through all tracks and segments, and it stays stuck forever in the primary thread of the operation. I.e., the main thread for the download track threads, or the track thread for the download segment threads.

I've also removed all checks for cancelled futures, as they will never be cancelled before they get the chance to run now that no future cancel calls are made.
2023-02-28 18:05:03 +00:00