assume the following stack: cpp <- rproxyA <- rproxyB <- WAN
if A also accepts WAN requests, and A muxes both B and WAN
onto a single connection to cpp, then WAN requests may get
tagged with the IP-address of the most recent B request
aside from the confusing logs, this could break
unpost on servers with shared accounts
socket.accept() can fail silently --
this would crash the worker-pool and also produce
a confusing, useless error-message while doing so
reported by someone on a mac with Little Snitch:
uv python install cpython-3.13.3-macos-aarch64-none
uv python pin cpython-3.13.3-macos-aarch64-none
uv sync
uv run copyparty
...but was also observed on x86_64 linux with
python 2.7 in 2018 (no longer reproduces)
fix this to log what's going on and also don't crash
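roughly the intended handling, as a sketch (names hypothetical):
    def accept_loop(srv_sck, handle):
        while True:
            try:
                sck, addr = srv_sck.accept()
            except OSError as ex:
                # accept() can fail (ECONNABORTED, EMFILE, meddling
                # firewalls, ...); log why and keep the pool alive
                print("accept() failed: %r" % (ex,))
                continue
            handle(sck, addr)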
If a file has no known extension, the content type gets set to
application/octet-stream, causing the browser to try and download the
file when viewed directly.
This quickly becomes annoying, as many of the files I interact with
have no extension, e.g. config files, log files, LICENSE files and
other random text files.
This patch uses libmagic to detect the file type and set the
content-type header. It also does this for the RSS feed and webdav for
sake of completeness.
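A rough sketch of the idea, assuming the python-magic package:
    import magic  # python-magic
    def guess_mime(path, fallback="application/octet-stream"):
        # let libmagic sniff the content-type for extensionless files
        try:
            return magic.from_file(path, mime=True) or fallback
        except Exception:
            return fallback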
This patch does not touch the front end at all, so these files still have a 'txt'
button and a type of '%' in the web UI. But when they are clicked, the browser
will display them correctly.
This feature is enabled with the existing "magic" option. I thought this
fit as the existing functionality also uses libmagic and gives file
extensions to files on upload. Tell me if it should be its own option
instead.
The code base was very confusing; this patch works, but I have no idea if
it's the way you'd like this implemented. Hopefully it's acceptable as
is.
this change should not alter behavior; the code was already correct
prevents the following message on stdout during startup:
SyntaxWarning: 'return' in a 'finally' block
until now, ${u} would match all users,
${u%-foo} would exclude users in group foo,
${u%+foo} would only include users in group foo
now, the following is also possible:
${u%-foo,%-bar} excludes users in group foo and/or group bar,
${u%+foo,%+bar} only includes users which are in groups foo AND bar,
${g%-foo} skips group foo (includes all others),
${g%-foo,%-bar} skips group foo and/or bar (includes all others)
see ./docs/examples/docker/idp/copyparty.conf ;
https://github.com/9001/copyparty/blob/hovudstraum/docs/examples/docker/idp/copyparty.conf
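hypothetical examples in the usual `-v` syntax (volume specs made up;
the linked config shows real usage):
-v 'team::r,${u%+foo,%+bar}'  # read-access only for users in both foo AND bar
-v 'pub::r,${u%-foo,%-bar}'   # read-access for everyone not in foo or bar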
file hashing became drastically slower in recent chrome versions;
* 748 MiB/s in 131.0.6778.86
* 747 MiB/s in 132.0.6834.160
* 485 MiB/s in 133.0.6943.60
* 319 MiB/s in 134.0.6998.36
the silver lining: it looks like chrome-bug 1352210 is improving
(crypto.subtle, the native hasher, now scales with multiple cores)
* 133.0.6943.60: speed peaked at 2 threads; 341 MiB/s, 485 MiB/s
* 134.0.6998.36: peak at 7; 193, 383, 383, 408, 421, 431, 438, 438
* 137.0.7151.41: peak at 8; 210, 382, 445, 513, 573, 573, 585, 598
MiB/s when hashing with 1, 2, ..., 7, 8 webworkers respectively
on a ryzen7-5800x with 2x16g 2133mhz ram
characteristics of versions between v134 and v137 are unknown
(cannot find old official builds to test), but v137 is a good
cutoff for minimizing risk of hitting chrome-bugs
meanwhile, hash-wasm scales linearly up to 8 cores;
0=328 1=377 2=738 3=947 4=1090 5=1190 6=1380 7=1530 8=1810
(0 = wasm on mainthread, no webworkers)
but it looks like chrome-bug 383568268 is making a return,
so keep the limit of max 4 threads if machine has more than
4 cores (and numCores-1 otherwise)
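in other words, roughly:
    def max_hash_threads(ncores):
        # cap at 4 when more than 4 cores, otherwise ncores-1
        return 4 if ncores > 4 else max(1, ncores - 1)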
previously, `--rp-loc` only took effect for trusted reverse-proxies
this was a source of confusion when setting up a config from
scratch, since there is no obvious relation to `--xff-src`
as this behavior was incidental, `--rp-loc` is now always applied,
even if the proxy is untrusted (or not detected at all)
if a hook relocates a file into a folder where that same file
exists with the same filename, the filename-collision-avoidance
would kick in, generating a new filename and another copy
* do not take lock on shares-db / sessions-db when running with
`--ah-gen` or `--ah-cli` (allows a 2nd instance for that purpose)
* add options to print effective salt for ah/fk/dk; useful for nixos
and other usecases where config is derived or otherwise opaque
* mention potential hdd-bottleneck from big values
* most browsers enforce a max-value of 6 (c354a38b)
* chunk-stitching (132a8350) made this less important;
still beneficial, but only to a point
this avoids a false-positive in the info-zip unzip zipbomb detector.
unfortunately,
* now impossible to extract large (4 GiB) zipfiles using old software
(WinXP, macos 10.12)
* now less viable to stream download-as-zip into a zipfile unpacker
(please use download-as-tar for that purpose)
context:
the zipfile specification (APPNOTE.TXT) is slightly ambiguous as to when
data-descriptor (0x504b0708) filesize-fields change from 32bit to 64bit;
both copyparty and libarchive independently made the same interpretation
that this is only when the local header is zip64, AND the size-fields
are both 0xFFFFFFFF. This makes sense because the data descriptor is
only necessary when that particular file-to-be-added exceeds 4 GiB,
and/or when the crc32 is not known ahead of time.
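a sketch of reading a descriptor under that interpretation
(helper/flag names hypothetical):
    import struct
    def parse_data_descriptor(buf, local_hdr_is_zip64_and_sizes_ffff):
        if buf[:4] == b"PK\x07\x08":  # the descriptor signature is optional
            buf = buf[4:]
        if local_hdr_is_zip64_and_sizes_ffff:
            # crc32 stays 32-bit, the two size-fields become 64-bit
            return struct.unpack("<LQQ", buf[:20])
        # otherwise the classic 32-bit layout
        return struct.unpack("<LLL", buf[:12])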
another interpretation, seen in an early version of the patchset
to fix CVE-2019-13232 (zip-bombs) in the info-zip unzip command,
believes the only requirement is that the local header is zip64.
in many linux distributions, the unzip command would thus fail on
zipfiles created by copyparty, since they (by default) satisfy
the three requirements to hit the zipbomb false-positive:
* total filesize exceeds 4 GiB, and...
* a mix of regular (32bit) and zip64 entries, and...
* streaming-mode zipfile (not made with ?zip=crc)
this issue no longer exists in a more recent version of that patchset,
https://github.com/madler/unzip/commit/af0d07f95809653b
but this fix has not yet made it into most linux distros
directory-tree sidebar did not sort correctly for non-ascii names
also fix a natural-sort bug; it only took effect for the
initial folder load, and not when changing the sort-order
also, natural-sort will now apply to all non-numeric fields,
not just the filename like before
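one common way to build such a sort key (not necessarily the exact implementation):
    import re
    def natsort_key(s):
        # split into digit / non-digit runs so "file10" sorts after "file9"
        return [int(t) if t.isdigit() else t.lower()
                for t in re.split(r"(\d+)", s)]
    # usage: sorted(names, key=natsort_key)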
* support newlines in svg files;
* `--error--\ncheck\nserver\nlog`
* `upload\nonly`
* thumbnails of files with lastmodified year 1601 would
make the cleaner print a harmless but annoying warning
the thumbnailer / audio transcoder could return misleading errors
if the operation fails due to insufficient filesystem permissions
try reading a few bytes from the file and bail early if it fails,
and detect/log unwritable output folders for thumbnails
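the early-read check, roughly (sketch):
    def assert_readable(path, n=8):
        # surface permission problems before handing the file to the
        # transcoder, so the error names the real cause
        with open(path, "rb") as f:
            f.read(n)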
also fixes http-response to only return svg-formatted errors
if the initial request expects a picture in response, not audio
remove an overly careful safety-check which would refuse creating
directories if the location was outside of the volume's base-path
it is safe to trust `rem` due to `vpath = undot(vpath)` and
a similar check being performed inside `vfs.get` as well,
so this served no purpose
the up2k databases are, by default, stored in a `.hist` subfolder
inside each volume, next to thumbnails and transcoded audio
add a new option for storing the databases in a separate location,
making it possible to tune the underlying filesystem for optimal
performance characteristics
the `--hist` global-option and `hist` volflag still behave like
before, but `--dbpath` and volflag `dbpath` will override the
histpath for the up2k-db and up2k-snap exclusively
`--md-hist` / volflag `md_hist` specifies where to put old
versions of markdown files when edited using the web-ui;
* `s` = create `.hist` subfolder next to the markdown file
(the default, both previously and now)
* `v` = use the volume's hist-path, either according to
`--hist` or the `hist` volflag. NOTE: old versions
will not be retrievable through the web-ui
* `n` = nope / disabled; overwrite without backup
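hypothetical examples (paths made up, usual volflag syntax assumed):
--dbpath /mnt/nvme/cppdb                          # globally
-v '/srv/pics:pics:r:c,dbpath=/mnt/nvme/cppdb'    # or per-volume
--md-hist v                                       # markdown backups go to the hist-path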
specifically google, but also some others, have started ignoring
rel="nofollow" while also understanding just enough javascript to
try viewing binary files as text
download-as-tar-gz becomes 2.4x faster in docker thanks to zlib-ng
zlib-ng segfaults on windows, so don't use it there
does not affect fedora or gentoo,
since zlib-ng is already system-default on those
also adds a global-option to write list of successful
binds to a textfile, for automation / smoketest purposes
too restrictive, blocking editing through webdav and ftp
but since logues and readmes can be used as helptext for users
with write-only access, it makes sense to block logue/readme
uploads from write-only users
users with write-only access can still upload any file as before,
but the filename prefix `_wo_` is added onto files named either
README.md | PREADME.md | .prologue.html | .epilogue.html
the new option `--wo-up-readme` restores previous behavior, and
will not add the filename-prefix for readmes/logues
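roughly (sketch; names hypothetical):
    def wo_fname(fname, is_write_only, wo_up_readme):
        logues = ("README.md", "PREADME.md", ".prologue.html", ".epilogue.html")
        if is_write_only and not wo_up_readme and fname in logues:
            return "_wo_" + fname  # keep the upload, defang the helptext
        return fname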
just like before, if vpath contains ${u} then
the IdP-volume is created unconditionally
but this is new:
${u%+foo} creates the vol only if user is member of group foo
${u%-foo} creates the vol if user is NOT member of group foo
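hypothetical example (exact placement of the filter may differ;
see the idp example config):
-v '/srv/u/${u}:u/${u%+foo}:A,${u}'  # per-user vol, but only for members of group foo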
previously, when moving or renaming a symlink to a file (or
a folder with symlinks inside), the dedup setting would decide
whether those links would be expanded into full files or not
with dedup disabled (which is the default),
all symlinks would be expanded during a move operation
now, the dedup-setting is ignored when files/folders are moved,
but it still applies when uploading or copying files/folders
* absolute symlinks are moved as-is
* relative symlinks are rewritten as necessary,
assuming both source and destination are known in the db
should catch all the garbage that macs sprinkle onto flashdrives;
https://a.ocv.me/pub/stuff/?doc=appledoubles-and-friends.txt
will notice and suggest to skip the following files/dirs:
* __MACOSX
* .DS_Store
* .AppleDouble
* .LSOverride
* .DocumentRevisions-*
* .fseventsd
* .Spotlight-V*
* .TemporaryItems
* .Trashes
* .VolumeIcon.icns
* .com.apple.timemachine.donotpresent
* .AppleDB
* .AppleDesktop
* .apdisk
and conditionally ._foo.jpg if foo.jpg is also being uploaded
was overly aggressive until now, thinking the following was unsafe:
-v 'x::' # no-anonymous-access
-v 'x/${u}:${u}:r:A,${u}' # world-readable,user-admin
-v 'x/${u}/priv:${u}/priv:A,${u}' # only-user-admin
now it realizes that this is safe because both IdP volumes
will be created/owned by the same user
however, if the first volume is 'x::r' then this is NOT safe,
and is now still correctly detected as being dangerous
also add a separate warning if `${g}` and `${u}` are mixed
in a volpath, since that is PROBABLY (not provably) unsafe
`write_dls` assumed `vfs.all_nodes` included shares; make it so
shares now also appear in the active-downloads list, but the
URL is hidden unless the viewer definitely already knows the
share exists (which is why vfs-nodes now have `shr_owner`)
also adds PRTY_FORCE_MP, a beefybit (opposite of chickenbit)
to allow multiprocessing on known-buggy platforms (macos)
previously, the native python-error was printed when reading
the contents of a textfile using the wrong character encoding
while technically correct, it could be confusing for end-users
add a helper to produce a more helpful errormessage when
someone (for example) tries to load a latin-1 config file
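a sketch of such a helper (hypothetical):
    def read_utf8(path):
        with open(path, "rb") as f:
            buf = f.read()
        try:
            return buf.decode("utf-8")
        except UnicodeDecodeError as ex:
            t = "the file [%s] is not utf-8; please convert it (iconv, recode, ...): %s"
            raise Exception(t % (path, ex))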
android-chrome bug https://issues.chromium.org/issues/393149335
sends last-modified time `-11644473600` for all uploads
this has been fixed in chromium, but there might be similar
bugs in other browsers, so add server-side and client-side
detection for unreasonable lastmod times
previously, if the js detected a similar situation, it would
substitute the lastmod-time with the client's wallclock, but
now the server's wallclock is always preferred as fallback
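the server-side check is conceptually something like this (thresholds hypothetical):
    import time
    def sane_lastmod(lastmod):
        # reject obviously-bogus timestamps, e.g. -11644473600 (year 1601),
        # and fall back to the server wallclock
        now = time.time()
        if lastmod < 0 or lastmod > now + 86400:
            return now
        return lastmod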
* only indicate file-history for markdown files, since
other files won't load into the editor, which makes
it entirely pointless; decide by file extension instead
* text-editor: in files containing one single line,
^C followed by ^V ^Z would accidentally a letter
and fix unhydrated extensions
this fixes a DOM-Based XSS when preparing files for upload;
empty files would have their filenames rendered as HTML in
a messagebox, making it possible to trick users into running
arbitrary javascript by giving them maliciously-named files
note that, being a general-purpose webserver, it is still
intentionally possible to upload and execute arbitrary
javascript, just not in this unexpected manner
adds a third possible value for the `replace` property in handshakes:
* absent or False: never overwrite an existing file on the server,
and instead generate a new filename to avoid collision
* True: always overwrite existing files on the server
* "mt": only overwrite if client's last-modified is more recent
(this is the new option)
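the resulting server-side decision, roughly (sketch):
    def decide(replace, client_mtime, server_mtime):
        if replace is True:
            return "overwrite"
        if replace == "mt":
            return "overwrite" if client_mtime > server_mtime else "new-name"
        return "new-name"  # absent or False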
the new UI button toggles between all three options,
defaulting to never-overwrite
* `xz` would show the "unrecognized volflag" warning,
but it still applied correctly
* removing volflags with `-foo` would also show the warning
but it would still get removed correctly
* hide `ext_th_d` in the startup volume-listing
1. warn about unrecognized volflags
previously, when specifying an unknown volflag, it would
be silently ignored, giving the impression that it applied
2. also allow uppercase, kebab-case
(previously, only snake_case was accepted)
3. mention every volflag in --help-flags
(some volflags were missing)
some clients, including KDE Dolphin (kioworker/6.10), keep
sending requests without the basic-auth header, expecting
the server to respond with a 401 before it does
most clients only do this for the initial request, which is
usually a PROPFIND, which makes this nice and simple -- but
turns out we need to consider this for GET as well...
this is tricky because a graphical webbrowser must never
receive a 401 lest it becomes near-impossible to deauth,
and that's exactly what Dolphin pretends to be in its UA
man ( ´_ゝ`)
note: matching on `KIO/` would also hit konqueror, so don't
* add support for the COPY verb
* COPY/MOVE: add overwrite support;
default is True according to rfc
(only applies to single files for now)
* COPY/MOVE/MKCOL: return 401 as necessary
for clients which rechallenge frequently
such as KDE Dolphin (KIO/6.10)
* MOVE: support webdav:// Destination prefix
as used by KDE Dolphin (KIO/6.10)
* MOVE: vproxy support
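for example, copying a single file with curl (host/paths hypothetical;
the Overwrite header defaults to T per the rfc):
curl -u ed:pw -X COPY -H 'Destination: /bkp/a.txt' -H 'Overwrite: T' http://127.0.0.1:3923/a.txt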
when running copyparty without any config, it defaults to sharing
the current folder read-write for everyone. This makes sense for
quick one-off instances, but not in more permanent deployments
especially for docker, where the config can get lost by accident
in too many ways (compose typos, failed upgrade, selinux, ...)
the default should be to reject all access
add a safeguard which disables read-access if one or more
config-files were specified, but no volumes are defined
should prevent issues such as filebrowser/filebrowser#3719
new global-option / volflag `zip_who` specifies
who gets to use the download-as-zip/tar function;
* 0: nobody, same as --no-zip
* 1: admins
* 2: authorized users with read-access
* 3: anyone with read-access
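for example, restricting it to authorized users with read-access
(assuming the usual dashed spelling for the global option):
--zip-who 2                       # globally
-v '/srv/pub:pub:r:c,zip_who=2'   # or per-volume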
listen for errors from <img> and <video> in the media gallery and
show an error-toast to indicate that the file isn't going to appear
unfortunately, when iOS-Safari fails to decode an unsupported video,
Safari itself appears to believe that everything is fine, and doesn't
issue the expected error-event, meaning we cannot detect this...
for example, trying to play non-yuv420p vp9 webm will silently fail,
with the only symptom being the play() promise throwing as the
<video> is destroyed during cleanup (bbox-close or media unload)
recent iPads do not indicate being an iPad in the user-agent,
so the audio-player would fall back on transcoding to mp3,
assuming the device cannot play opus-caf
improve this with pessimistic feature-detection for caf
hopefully still avoiding false-positives
support for "owa", audio-only webm, was introduced in iOS 17.5
owa is a more compliant alternative to opus-caf from iOS 11,
which was technically limited to CBR opus, a limitation which
we ignored since it worked mostly fine for regular opus too
being the new officially-recommended way to do things,
we'll default to owa for iOS 18 and later, even though
iOS still has some bugs affecting our use specifically:
if a weba file is preloaded into a 2nd audio object,
safari will throw a spurious exception as playback is
initiated, even as the file is playing just fine
the `.ld` stuff is an attempt at catching and ignoring this
spurious error without eating any actual network exceptions
previously, the `?zip` url-suffix would create a cp437 zipfile,
and `?zip=utf` would use utf-8, which is now generally expected
now, both `?zip=utf` and `?zip` will produce a utf8 zipfile,
and `?zip=dos` provides the old behavior
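for example (host/path hypothetical):
curl -OJ 'http://127.0.0.1:3923/music/?zip'      # utf-8 zipfile (new default)
curl -OJ 'http://127.0.0.1:3923/music/?zip=dos'  # cp437, the old behavior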
fixes a bug reported on discord:
a sha512 checksum does not cleanly encode to base64, and the
padding runs afoul of the safety-check added in 988a7223f4
as there is not a single reason to use a filekey that long,
fix it by setting an upper limit (which is still ridiculous)
if an untrusted x-forwarded-for is received, then disable
some features which assume the client-ip to be correct:
* listing dotfiles recently uploaded from own ip
* listing ongoing uploads from own ip
* unpost recently uploaded files
this is in addition to the existing vivid warning in
the serverlogs, which empirically is possible to miss
may improve upload performance in some particular uncommon scenarios,
for example if hdd-writes are uncached, and/or the hdd is drastically
slower than the network throughput
one particular usecase where nosparse *might* improve performance
is when the upload destination is cloud-storage provided by FUSE
(for example an s3 bucket) but this is educated guesswork
try to decode some malicious xml on startup; if this succeeds,
then force-disable all xml-based features (primarily WebDAV)
this is paranoid future-proofing against unanticipated changes
in future versions of python, specifically if the importlib or
xml.etree.ET behavior changes in a way that somehow reenables
entity expansion, which (still hypothetically) would probably
be caused by failing to unload the `_elementtree` c-module
no past or present python versions are affected by this change
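conceptually, the probe looks something like this (not the actual
implementation; the hardened parser factory is a hypothetical stand-in):
    import xml.etree.ElementTree as ET
    def xml_still_expands_entities(mk_parser):
        evil = '<!DOCTYPE a [<!ENTITY e "pwn">]><a>&e;</a>'
        try:
            root = ET.fromstring(evil, parser=mk_parser())
        except Exception:
            return False  # parser refused; xml features stay enabled
        return root.text == "pwn"  # expansion happened; disable webdav etc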
when loading up2k snaps, entries are forgotten if
the relevant file has been deleted since last run
when the entry is an unfinished upload, the file that should
be asserted is the .PARTIAL, and not the placeholder / final
filename (which, unintentionally, was the case until now)
if .PARTIAL is missing but the placeholder still exists,
the only safe alternative is to forget/disown the file,
since its state is obviously wrong and unknown
also includes a slight tweak to the json upload info:
when exactly one file is uploaded, the json-response has a
new top-level property, `fileurl` -- this is just a copy of
`files[0].url` as a workaround for castdrian/ishare#107
("only toplevel json properties can be referenced")
when hashing files on android-chrome, read a contiguous range of
several chunks at a time, ensuring each read is at least 48 MiB
and then slice that cache into the correct chunksizes for hashing
especially on GrapheneOS Vanadium (where webworkers are forbidden),
improves worst-case speed (filesize <= 256 MiB) from 13 to 139 MiB/s
48M was chosen wrt RAM usage (48*4 MiB); a target read-size of
16M would have given 76 MiB/s, 32M = 117 MiB/s, and 64M = 154 MiB/s
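conceptually (the real client is javascript; this is a python sketch,
names and hash algo illustrative):
    import hashlib
    def hash_coalesced(f, fsize, chunksize, min_read=48 * 1024 * 1024):
        # read a contiguous run of chunks per pass to amortize the fixed
        # per-read overhead, then slice the run back into up2k chunks
        readsize = max(chunksize, -(-min_read // chunksize) * chunksize)
        ofs = 0
        while ofs < fsize:
            buf = f.read(min(readsize, fsize - ofs))
            if not buf:
                break
            for i in range(0, len(buf), chunksize):
                yield hashlib.sha512(buf[i:i + chunksize]).digest()
            ofs += len(buf)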
additionally, on all platforms (not just android and/or chrome),
allow async hashing of <= 3 chunks in parallel on main-thread
when chunksize <= 48 MiB, and <= 2 at <= 96 MiB; this gives
decent speeds approaching that of webworkers (around 50%)
this is a new take on c06d928bb5
which was removed in 184af0c603
when a chrome-beta temporarily fixed the poor file-read performance
(afaict the fix was reverted and never made it to chrome stable)
as for why any of this is necessary,
the security features in android have the unfortunate side-effect
of making file-reads from web-browsers extremely expensive;
this is especially noticeable in android-chrome, where
file-hashing is painfully slow, around 9 MiB/s worst-case
this is due to a fixed-time overhead for each read operation;
reading 1 MiB takes 60 msec, while reading 16 MiB takes 112 msec
fixes a bug reported on discord;
1. run with `--idp-h-usr=iu -v=srv::A`
2. upload a file with up2k; this succeeds
3. announce an idp user: `curl -Hiu:a 127.1:3923`
4. upload another file; fails with "fs-reload"
the idp announce would `up2k.reload` which raises the
`reload_flag` and `rescan_cond`, but there is nothing
listening on `rescan_cond` because `have_e2d` was false
must assume e2d if idp is enabled, because `have_e2d` will
only be true if there are non-idp volumes with e2d enabled
in case someone writes a plugin which
expects certain params to be sanitized
note that because mojibake filenames are supported,
URLs and filepaths can still be absolutely bonkers
this fixes one known issue:
invalid rss-feed xml if ?pw contains special chars
...and somehow things now run 2% faster, idgi
18:17:26 &ed | what's wrong with it
18:17:38 +Mai | that you don't know it's the volume bar before you try it
18:17:46 &ed | oh
18:17:48 &ed | yeah i guess
18:17:54 +Mai | especially when it's at 100
18:18:00 &ed | how do i fix it tho
18:19:50 +Mai | you could add an icon that's also a mute button (to not make it a useless icon)
18:22:38 &ed | i'll make the volume text always visible and include a speaker icon before it
18:23:53 +Mai | that is better at least
when deleting a folder, any dotfiles/folders within would only
be deleted if the user had the dot-permission to see dotfiles;
this gave the confusing behavior of not removing the "empty"
folders after deleting them
fix this to only require the delete-permission, and always
delete the entire folder, including any dotfiles within
similar behavior would also apply to moves, renames, and copies;
fix moves and renames to only require the move-permission in
the source volume; dotfiles will now always be included,
regardless of whether the user does (or does not) have the
dot-permission in the source and/or destination volumes
copying folders now also behaves more intuitively: if the user has
the dot-permission in the target volume, then dotfiles will only be
included from source folders where the user also has the dot-perm,
to prevent the user from seeing intentionally hidden files/folders
as processing of a HTTP request begins (GET, HEAD, PUT, POST, ...),
the original query line is printed in its encoded form. This makes
debugging easier, since there is no ambiguity in how the client
phrased its request.
however, this results in very opaque logs for non-ascii languages;
basically a wall of percent-encoded characters. Avoid this issue
by printing an additional log-message if the URL contains `%`,
immediately below the original url-encoded entry.
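the decoded-logline idea, roughly (sketch):
    from urllib.parse import unquote
    def log_req(log, enc_url):
        log(enc_url)  # always the exact encoded form first
        if "%" in enc_url:
            log(" `-- " + unquote(enc_url, errors="replace"))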
also fix tests on macos, and an unrelated bad logmsg in up2k
chrome (and chromium-based browsers) can OOM when:
* the OS is Windows, MacOS, or Android (but not Linux?)
* the website is hosted on a remote IP (not localhost)
* webworkers are used to read files
unfortunately this also applies to Android, which heavily relies
on webworkers to make read-speeds anywhere close to acceptable
as for android, there are diminishing returns with more than 4
webworkers (1=1x, 2=2.3x, 3=3.8x, 4=4.2x, 6=4.5x, 8=5.3x), and
limiting the number of workers to ensure at least one idle core
appears to sufficiently reduce the OOM probability
on desktop, webworkers are only necessary for hashwasm, so
limit the number of workers to 2 if crypto.subtle is available
and otherwise use the nproc-1 rule for hashwasm in workers
bug report: https://issues.chromium.org/issues/383568268
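the resulting rule, roughly (sketch; names hypothetical):
    def num_hash_workers(nproc, is_android, have_crypto_subtle):
        if is_android:
            return max(1, nproc - 1)  # keep at least one core idle
        if have_crypto_subtle:
            return 2  # workers only needed for hash-wasm anyways
        return max(1, nproc - 1)  # hash-wasm in workers; nproc-1 rule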
if a NIC is brought up with several IPs,
it would only mention one of the new IPs in the logs
or if a PCIe bus crashes and all NICs drop dead,
it would only mention one of the IPs that disappeared
as both scenarios are oddly common, be more verbose
previously, when IdP was enabled, the password-based login would be
entirely disabled. This was a semi-conscious decision, based on the
assumption that you would always want to use IdP after enabling it.
it makes more sense to keep password-based login working as usual,
conditionally disengaging it for requests which contain a valid
IdP username header. This makes it possible to define fallback
users, or API-only users, and all similar escape hatches.
if someone accidentally starts uploading a file in the wrong folder,
it was not obvious that you can forget that upload in the unpost tab
this '(explain)' button in the upload-error hopefully explains that,
and upload immediately commences when the initial attempt is aborted
on the backend, cleanup the dupesched when an upload is
aborted, and save some cpu by adding unique entries only
url-param / header `ck` specifies hashing algo;
md5 sha1 sha256 sha512 b2 blake2 b2s blake2s
value 'no' or blank disables checksumming,
for when copyparty is running on ancient gear
and you don't really care about file integrity
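for example, with a PUT-style upload (host/path hypothetical):
curl -T f.bin 'http://127.0.0.1:3923/inc/?ck=b2'   # respond with a blake2 checksum
curl -T f.bin 'http://127.0.0.1:3923/inc/?ck=no'   # skip checksumming entirely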