Compare commits

...

15 Commits

Author  SHA1        Message                                               Date
ed      a900c36395  v1.3.15                                               2022-08-18 01:02:19 +02:00
ed      1d9b324d3e  explain w/a wasm leaks in workers (chrome bug)        2022-08-18 01:02:06 +02:00
ed      539e7b8efe  help chrome gc by reusing one filereader              2022-08-18 00:05:32 +02:00
ed      50a477ee47  up2k-hook-ytid: upload into subdirs by id             2022-08-15 21:52:41 +02:00
ed      7000123a8b  v1.3.14                                               2022-08-15 20:25:31 +02:00
ed      d48a7d2398  provide tagparsers with uploader info                 2022-08-15 20:23:17 +02:00
ed      389a00ce59  v1.3.13                                               2022-08-15 19:11:21 +02:00
ed      7a460de3c2  windows db fix                                        2022-08-15 18:01:28 +02:00
ed      8ea1f4a751  idx multimedia format/container type                  2022-08-15 17:56:13 +02:00
ed      1c69ccc6cd  v1.3.12                                               2022-08-13 00:58:49 +02:00
ed      84b5bbd3b6  u2cli: bail from recursive symlinks + verbose errors  2022-08-13 00:28:08 +02:00
ed      9ccd327298  add directory hashing (boots ~3x faster)              2022-08-12 23:17:18 +02:00
ed      11df36f3cf  add option to exit after scanning volumes             2022-08-12 21:20:13 +02:00
ed      f62dd0e3cc  support fips-cpython and maybe make-sfx on macos      2022-08-12 16:36:20 +02:00
ed      ad18b6e15e  stop reindexing empty files on startup                2022-08-12 16:31:36 +02:00
21 changed files with 441 additions and 80 deletions

@@ -249,12 +249,18 @@ some improvement ideas
* Windows: if the `up2k.db` (filesystem index) is on a samba-share or network disk, you'll get unpredictable behavior if the share is disconnected for a bit
  * use `--hist` or the `hist` volflag (`-v [...]:c,hist=/tmp/foo`) to place the db on a local disk instead
* all volumes must exist / be available on startup; up2k (mtp especially) gets funky otherwise
* [the database can get stuck](https://github.com/9001/copyparty/issues/10)
  * has only happened once but that is once too many
  * luckily not dangerous for file integrity and doesn't really stop uploads or anything like that
  * but would really appreciate some logs if anyone ever runs into it again
* probably more, pls let me know
## not my bugs
* [Chrome issue 1317069](https://bugs.chromium.org/p/chromium/issues/detail?id=1317069) -- if you try to upload a folder which contains symlinks by dragging it into the browser, the symlinked files will not get uploaded
* [Chrome issue 1352210](https://bugs.chromium.org/p/chromium/issues/detail?id=1352210) -- plaintext http may be faster at filehashing than https (but also extremely CPU-intensive)
* iPhones: the volume control doesn't work because [apple doesn't want it to](https://developer.apple.com/library/archive/documentation/AudioVideo/Conceptual/Using_HTML5_Audio_Video/Device-SpecificConsiderations/Device-SpecificConsiderations.html#//apple_ref/doc/uid/TP40009523-CH5-SW11)
  * *future workaround:* enable the equalizer, make it all-zero, and set a negative boost to reduce the volume
  * "future" because `AudioContext` is broken in the current iOS version (15.1), maybe one day...
@@ -1008,6 +1014,10 @@ this is due to `crypto.subtle` [not yet](https://github.com/w3c/webcrypto/issues) supporting streaming hashing
as a result, the hashes are much less useful than they could have been (search the server by sha512, provide the sha512 in the response http headers, ...)
however it allows for hashing multiple chunks in parallel, greatly increasing upload speed from fast storage (NVMe, raid-0 and such)
* both the [browser uploader](#uploading) and the [commandline one](https://github.com/9001/copyparty/blob/hovudstraum/bin/up2k.py) do this now, allowing for fast uploading even from plaintext http
hashwasm would solve the streaming issue but reduces hashing speed for sha512 (xxh128 does 6 GiB/s), and it would make old browsers and [iphones](https://bugs.webkit.org/show_bug.cgi?id=228552) unsupported
* blake2 might be a better choice since xxh is non-cryptographic, but that gets ~15 MiB/s on slower androids
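
(illustration, not part of the diff: a rough python sketch of the parallel chunk-hashing idea described above -- each fixed-size chunk is hashed independently since no streaming api is available, and the digests are truncated to 33 bytes like in the hash-worker diff further down; the chunk size and worker count here are made up)

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor

CHUNK = 16 * 1024 * 1024  # hypothetical chunk size


def hash_chunk(path, ofs):
    # hash one chunk independently -- no streaming needed
    with open(path, "rb") as f:
        f.seek(ofs)
        return hashlib.sha512(f.read(CHUNK)).digest()[:33]


def hash_chunks(path, workers=4):
    # hash all chunks in parallel; keeps fast storage saturated
    offsets = range(0, os.path.getsize(path), CHUNK)
    with ThreadPoolExecutor(workers) as ex:
        return list(ex.map(lambda o: hash_chunk(path, o), offsets))
```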
@@ -1041,6 +1051,7 @@ when uploading files,
* if you're cpu-bottlenecked, or the browser is maxing a cpu core:
  * up to 30% faster uploads if you hide the upload status list by switching away from the `[🚀]` up2k ui-tab (or closing it)
    * optionally you can switch to the lightweight potato ui by clicking the `[🥔]`
    * switching to another browser-tab also works, the favicon will update every 10 seconds in that case
  * unlikely to be a problem, but can happen when uploading many small files, or your internet is too fast, or PC too slow

bin/mtag/mousepad.py Normal file
@@ -0,0 +1,38 @@
#!/usr/bin/env python3

import os
import sys
import subprocess as sp

"""
mtp test -- opens a texteditor

usage:
  -v srv/v1:v1:r:c,mte=+x1:c,mtp=x1=ad,p,bin/mtag/mousepad.py

explained:
  c,mte: list of tags to index in this volume
  c,mtp: add new tag provider
     x1: dummy tag to provide
     ad: dontcare if audio or not
      p: priority 1 (run after initial tag-scan with ffprobe or mutagen)
"""


def main():
    env = os.environ.copy()
    env["DISPLAY"] = ":0.0"

    if False:
        # open the uploaded file
        fp = sys.argv[-1]
    else:
        # display stdin contents (`oth_tags`)
        fp = "/dev/stdin"

    # pass env so mousepad opens on the DISPLAY set above
    p = sp.Popen(["/usr/bin/mousepad", fp], env=env)
    p.communicate()


main()

@@ -47,8 +47,8 @@ CONDITIONAL_UPLOAD = True
def main():
fp = sys.argv[1]
if CONDITIONAL_UPLOAD:
fp = sys.argv[1]
zb = sys.stdin.buffer.read()
zs = zb.decode("utf-8", "replace")
md = json.loads(zs)
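
(aside: as this hunk shows, `mtp` plugins receive the filepath as an argument and a json object on stdin; with this changeset that json can also carry uploader info. a hypothetical minimal parser -- the `up_ip` / `up_at` keys match the up2k.py changes further down)

```python
#!/usr/bin/env python3
# hypothetical minimal mtp tagparser: reads the stdin json
# and prints one tag value to stdout
import json
import sys


def main():
    zs = sys.stdin.buffer.read().decode("utf-8", "replace")
    md = json.loads(zs)
    # up_ip / up_at are the uploader-info keys added in this changeset
    print(md.get("up_ip") or "")


main()
```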

@@ -97,7 +97,7 @@ def main():
zs = (
"ffmpeg -y -hide_banner -nostdin -v warning"
+ " -err_detect +crccheck+bitstream+buffer+careful+compliant+aggressive+explode"
" -xerror -i"
+ " -xerror -i"
)
cmd = zs.encode("ascii").split(b" ") + [fsenc(fp)]

@@ -3,7 +3,7 @@ from __future__ import print_function, unicode_literals
"""
up2k.py: upload to copyparty
2022-08-10, v0.17, ed <irc.rizon.net>, MIT-Licensed
2022-08-13, v0.18, ed <irc.rizon.net>, MIT-Licensed
https://github.com/9001/copyparty/blob/hovudstraum/bin/up2k.py
- dependencies: requests
@@ -330,8 +330,8 @@ def _scd(err, top):
abspath = os.path.join(top, fh.name)
try:
yield [abspath, fh.stat()]
except:
err.append(abspath)
except Exception as ex:
err.append((abspath, str(ex)))
def _lsd(err, top):
@@ -340,8 +340,8 @@ def _lsd(err, top):
abspath = os.path.join(top, name)
try:
yield [abspath, os.stat(abspath)]
except:
err.append(abspath)
except Exception as ex:
err.append((abspath, str(ex)))
if hasattr(os, "scandir"):
@@ -350,15 +350,21 @@ else:
statdir = _lsd
def walkdir(err, top):
def walkdir(err, top, seen):
"""recursive statdir"""
atop = os.path.abspath(os.path.realpath(top))
if atop in seen:
err.append((top, "recursive-symlink"))
return
seen = seen[:] + [atop]
for ap, inf in sorted(statdir(err, top)):
if stat.S_ISDIR(inf.st_mode):
try:
for x in walkdir(err, ap):
for x in walkdir(err, ap, seen):
yield x
except:
err.append(ap)
except Exception as ex:
err.append((ap, str(ex)))
else:
yield ap, inf
@@ -373,7 +379,7 @@ def walkdirs(err, tops):
stop = os.path.dirname(top)
if os.path.isdir(top):
for ap, inf in walkdir(err, top):
for ap, inf in walkdir(err, top, []):
yield stop, ap[len(stop) :].lstrip(sep), inf
else:
d, n = top.rsplit(sep, 1)
@@ -576,12 +582,19 @@ class Ctl(object):
if err:
eprint("\n# failed to access {0} paths:\n".format(len(err)))
for x in err:
eprint(x.decode("utf-8", "replace") + "\n")
for ap, msg in err:
if ar.v:
eprint("{0}\n `-{1}\n\n".format(ap.decode("utf-8", "replace"), msg))
else:
eprint(ap.decode("utf-8", "replace") + "\n")
eprint("^ failed to access those {0} paths ^\n\n".format(len(err)))
if not ar.v:
eprint("hint: set -v for detailed error messages\n")
if not ar.ok:
eprint("aborting because --ok is not set\n")
eprint("hint: aborting because --ok is not set\n")
return
eprint("found {0} files, {1}\n\n".format(nfiles, humansize(nbytes)))
@@ -929,6 +942,7 @@ source file/folder selection uses rsync syntax, meaning that:
ap.add_argument("url", type=unicode, help="server url, including destination folder")
ap.add_argument("files", type=unicode, nargs="+", help="files and/or folders to process")
ap.add_argument("-v", action="store_true", help="verbose")
ap.add_argument("-a", metavar="PASSWORD", help="password")
ap.add_argument("-s", action="store_true", help="file-search (disables upload)")
ap.add_argument("--ok", action="store_true", help="continue even if some local files are inaccessible")

@@ -51,6 +51,8 @@ async function a_up2k_namefilter(good_files, nil_files, bad_files, hooks) {
cname = name, // will clobber
sz = fobj.size,
ids = [],
fn_ids = [],
md_ids = [],
id_ok = false,
m;
@@ -71,7 +73,7 @@ async function a_up2k_namefilter(good_files, nil_files, bad_files, hooks) {
cname = cname.replace(m[1], '');
yt_ids.add(m[1]);
ids.push(m[1]);
fn_ids.unshift(m[1]);
}
// look for IDs in video metadata,
@@ -110,10 +112,13 @@ async function a_up2k_namefilter(good_files, nil_files, bad_files, hooks) {
console.log(`found ${m} @${bofs}, ${name} `);
yt_ids.add(m);
if (!has(ids, m)) {
ids.push(m);
if (!has(fn_ids, m) && !has(md_ids, m)) {
md_ids.push(m);
md_only.push(`${m} ${name}`);
}
else
// id appears several times; make it preferred
md_ids.unshift(m);
// bail after next iteration
chunk = nchunks - 1;
@@ -130,6 +135,13 @@ async function a_up2k_namefilter(good_files, nil_files, bad_files, hooks) {
}
}
}
for (var yi of md_ids)
ids.push(yi);
for (var yi of fn_ids)
if (!has(ids, yi))
ids.push(yi);
}
if (md_only.length)
@@ -164,6 +176,7 @@ async function a_up2k_namefilter(good_files, nil_files, bad_files, hooks) {
function process_id_list(txt) {
var wanted_ids = new Set(txt.trim().split('\n')),
name_id = {},
wanted_names = new Set(), // basenames with a wanted ID
wanted_files = new Set(); // filedrops
@@ -174,8 +187,11 @@ async function a_up2k_namefilter(good_files, nil_files, bad_files, hooks) {
wanted_files.add(good_files[a]);
var m = /(.*)\.(mp4|webm|mkv|flv|opus|ogg|mp3|m4a|aac)$/i.exec(name);
if (m)
wanted_names.add(m[1]);
if (!m)
continue;
wanted_names.add(m[1]);
name_id[m[1]] = file_ids[a][b];
break;
}
@@ -189,6 +205,9 @@ async function a_up2k_namefilter(good_files, nil_files, bad_files, hooks) {
name = name.replace(/\.[^\.]+$/, '');
if (wanted_names.has(name)) {
wanted_files.add(good_files[a]);
var subdir = `${name_id[name]}-${Date.now()}-${a}`;
good_files[a][1] = subdir + '/' + good_files[a][1].split(/\//g).pop();
break;
}
}

@@ -335,7 +335,7 @@ def run_argparse(argv: list[str], formatter: Any, retry: bool) -> argparse.Names
except:
fk_salt = "hunter2"
cores = os.cpu_count() if hasattr(os, "cpu_count") else 4
cores = (os.cpu_count() if hasattr(os, "cpu_count") else 0) or 4
hcores = min(cores, 3) # 4% faster than 4+ on py3.9 @ r5-4500U
sects = [
@@ -552,9 +552,10 @@ def run_argparse(argv: list[str], formatter: Any, retry: bool) -> argparse.Names
ap2.add_argument("--no-robots", action="store_true", help="adds http and html headers asking search engines to not index anything")
ap2.add_argument("--logout", metavar="H", type=float, default="8086", help="logout clients after H hours of inactivity (0.0028=10sec, 0.1=6min, 24=day, 168=week, 720=month, 8760=year)")
ap2 = ap.add_argument_group('yolo options')
ap2 = ap.add_argument_group('shutdown options')
ap2.add_argument("--ign-ebind", action="store_true", help="continue running even if it's impossible to listen on some of the requested endpoints")
ap2.add_argument("--ign-ebind-all", action="store_true", help="continue running even if it's impossible to receive connections at all")
ap2.add_argument("--exit", metavar="WHEN", type=u, default="", help="shutdown after WHEN has finished; for example 'idx' will do volume indexing + metadata analysis")
ap2 = ap.add_argument_group('logging options')
ap2.add_argument("-q", action="store_true", help="quiet")
@@ -610,6 +611,7 @@ def run_argparse(argv: list[str], formatter: Any, retry: bool) -> argparse.Names
ap2.add_argument("--hist", metavar="PATH", type=u, help="where to store volume data (db, thumbs)")
ap2.add_argument("--no-hash", metavar="PTN", type=u, help="regex: disable hashing of matching paths during e2ds folder scans")
ap2.add_argument("--no-idx", metavar="PTN", type=u, help="regex: disable indexing of matching paths during e2ds folder scans")
ap2.add_argument("--no-dhash", action="store_true", help="disable rescan acceleration; do full database integrity check -- makes the db ~5%% smaller and bootup/rescans 3~10x slower")
ap2.add_argument("--xdev", action="store_true", help="do not descend into other filesystems (symlink or bind-mount to another HDD, ...)")
ap2.add_argument("--xvol", action="store_true", help="skip symlinks leaving the volume root")
ap2.add_argument("--hash-mt", metavar="CORES", type=int, default=hcores, help="num cpu cores to use for file hashing; set 0 or 1 for single-core hashing")
@@ -628,9 +630,9 @@ def run_argparse(argv: list[str], formatter: Any, retry: bool) -> argparse.Names
ap2.add_argument("--mtag-v", action="store_true", help="verbose tag scanning; print errors from mtp subprocesses and such")
ap2.add_argument("-mtm", metavar="M=t,t,t", type=u, action="append", help="add/replace metadata mapping")
ap2.add_argument("-mte", metavar="M,M,M", type=u, help="tags to index/display (comma-sep.)",
default="circle,album,.tn,artist,title,.bpm,key,.dur,.q,.vq,.aq,vc,ac,res,.fps,ahash,vhash")
default="circle,album,.tn,artist,title,.bpm,key,.dur,.q,.vq,.aq,vc,ac,fmt,res,.fps,ahash,vhash")
ap2.add_argument("-mth", metavar="M,M,M", type=u, help="tags to hide by default (comma-sep.)",
default=".vq,.aq,vc,ac,res,.fps")
default=".vq,.aq,vc,ac,fmt,res,.fps")
ap2.add_argument("-mtp", metavar="M=[f,]BIN", type=u, action="append", help="read tag M using program BIN to parse the file")
ap2 = ap.add_argument_group('ui options')

@@ -1,8 +1,8 @@
# coding: utf-8
VERSION = (1, 3, 11)
VERSION = (1, 3, 15)
CODENAME = "god dag"
BUILD_DT = (2022, 8, 10)
BUILD_DT = (2022, 8, 18)
S_VERSION = ".".join(map(str, VERSION))
S_BUILD_DT = "{0:04d}-{1:02d}-{2:02d}".format(*BUILD_DT)

@@ -15,7 +15,7 @@ class Ico(object):
def get(self, ext: str, as_thumb: bool) -> tuple[str, bytes]:
"""placeholder to make thumbnails not break"""
zb = hashlib.md5(ext.encode("utf-8")).digest()[:2]
zb = hashlib.sha1(ext.encode("utf-8")).digest()[2:4]
if PY2:
zb = [ord(x) for x in zb]

@@ -178,7 +178,7 @@ def parse_ffprobe(txt: str) -> tuple[dict[str, tuple[int, Any]], dict[str, list[
]
if typ == "format":
kvm = [["duration", ".dur"], ["bit_rate", ".q"]]
kvm = [["duration", ".dur"], ["bit_rate", ".q"], ["format_name", "fmt"]]
for sk, rk in kvm:
v1 = strm.get(sk)
@@ -239,6 +239,9 @@ def parse_ffprobe(txt: str) -> tuple[dict[str, tuple[int, Any]], dict[str, list[
if ".q" in ret:
del ret[".q"]
if "fmt" in ret:
ret["fmt"] = ret["fmt"].split(",")[0]
if ".resw" in ret and ".resh" in ret:
ret["res"] = "{}x{}".format(ret[".resw"], ret[".resh"])

@@ -206,6 +206,9 @@ class SvcHub(object):
self.log("root", t, 1)
self.retcode = 1
self.sigterm()
def sigterm(self) -> None:
os.kill(os.getpid(), signal.SIGTERM)
def cb_httpsrv_up(self) -> None:

@@ -46,6 +46,7 @@ from .util import (
s3enc,
sanitize_fn,
statdir,
vjoin,
vsplit,
w8b64dec,
w8b64enc,
@@ -124,7 +125,7 @@ class Up2k(object):
self.mtp_parsers: dict[str, dict[str, MParser]] = {}
self.pending_tags: list[tuple[set[str], str, str, dict[str, Any]]] = []
self.hashq: Queue[tuple[str, str, str, str, float]] = Queue()
self.tagq: Queue[tuple[str, str, str, str]] = Queue()
self.tagq: Queue[tuple[str, str, str, str, str, float]] = Queue()
self.tag_event = threading.Condition()
self.n_hashq = 0
self.n_tagq = 0
@@ -182,6 +183,9 @@ class Up2k(object):
all_vols = self.asrv.vfs.all_vols
have_e2d = self.init_indexes(all_vols, [])
if not self.pp and self.args.exit == "idx":
return self.hub.sigterm()
thr = threading.Thread(target=self._snapshot, name="up2k-snapshot")
thr.daemon = True
thr.start()
@@ -571,7 +575,6 @@ class Up2k(object):
t = "online (running mtp)"
if scan_vols:
thr = threading.Thread(target=self._run_all_mtp, name="up2k-mtp-scan")
thr.daemon = True
else:
self.pp = None
t = "online, idle"
@@ -580,6 +583,7 @@ class Up2k(object):
self.volstate[vol.vpath] = t
if thr:
thr.daemon = True
thr.start()
return have_e2d
@@ -730,6 +734,13 @@ class Up2k(object):
if db.n:
self.log("commit {} new files".format(db.n))
if self.args.no_dhash:
if db.c.execute("select d from dh").fetchone():
db.c.execute("delete from dh")
self.log("forgetting dhashes in {}".format(top))
elif n_add or n_rm:
self._set_tagscan(db.c, True)
db.c.connection.commit()
return True, bool(n_add or n_rm or do_vac)
@@ -748,7 +759,7 @@ class Up2k(object):
xvol: bool,
) -> int:
if xvol and not rcdir.startswith(top):
self.log("skip xvol: [{}] -> [{}]".format(top, rcdir), 6)
self.log("skip xvol: [{}] -> [{}]".format(cdir, rcdir), 6)
return 0
if rcdir in seen:
@@ -756,29 +767,32 @@ class Up2k(object):
self.log(t.format(seen[-1], rcdir, cdir), 3)
return 0
ret = 0
seen = seen + [rcdir]
unreg: list[str] = []
files: list[tuple[int, int, str]] = []
assert self.pp and self.mem_cur
self.pp.msg = "a{} {}".format(self.pp.n, cdir)
ret = 0
unreg: list[str] = []
seen_files = {} # != inames; files-only for dropcheck
rd = cdir[len(top) :].strip("/")
if WINDOWS:
rd = rd.replace("\\", "/").strip("/")
g = statdir(self.log_func, not self.args.no_scandir, False, cdir)
gl = sorted(g)
inames = {x[0]: 1 for x in gl}
partials = set([x[0] for x in gl if "PARTIAL" in x[0]])
for iname, inf in gl:
if self.stop:
return -1
rp = vjoin(rd, iname)
abspath = os.path.join(cdir, iname)
rp = abspath[len(top) :].lstrip("/")
if WINDOWS:
rp = rp.replace("\\", "/").strip("/")
if rei and rei.search(abspath):
unreg.append(rp)
continue
nohash = reh.search(abspath) if reh else False
lmod = int(inf.st_mtime)
sz = inf.st_size
if stat.S_ISDIR(inf.st_mode):
@@ -804,19 +818,53 @@ class Up2k(object):
self.log("skip type-{:x} file [{}]".format(inf.st_mode, abspath))
else:
# self.log("file: {}".format(abspath))
seen_files[iname] = 1
if rp.endswith(".PARTIAL") and time.time() - lmod < 60:
# rescan during upload
continue
if not sz and (
"{}.PARTIAL".format(iname) in inames
or ".{}.PARTIAL".format(iname) in inames
"{}.PARTIAL".format(iname) in partials
or ".{}.PARTIAL".format(iname) in partials
):
# placeholder for unfinished upload
continue
rd, fn = rp.rsplit("/", 1) if "/" in rp else ["", rp]
files.append((sz, lmod, iname))
# folder of 1000 files = ~1 MiB RAM best-case (tiny filenames);
# free up stuff we're done with before dhashing
gl = []
partials.clear()
if not self.args.no_dhash:
if len(files) < 9000:
zh = hashlib.sha1(str(files).encode("utf-8", "replace"))
else:
zh = hashlib.sha1()
_ = [zh.update(str(x).encode("utf-8", "replace")) for x in files]
dhash = base64.urlsafe_b64encode(zh.digest()[:12]).decode("ascii")
sql = "select d from dh where d = ? and h = ?"
try:
c = db.c.execute(sql, (rd, dhash))
drd = rd
except:
drd = "//" + w8b64enc(rd)
c = db.c.execute(sql, (drd, dhash))
if c.fetchone():
return ret
seen_files = set([x[2] for x in files]) # for dropcheck
for sz, lmod, fn in files:
if self.stop:
return -1
rp = vjoin(rd, fn)
abspath = os.path.join(cdir, fn)
nohash = reh.search(abspath) if reh else False
if fn: # diff-golf
sql = "select w, mt, sz from up where rd = ? and fn = ?"
try:
c = db.c.execute(sql, (rd, fn))
@@ -833,7 +881,7 @@ class Up2k(object):
self.log(t.format(top, rp, len(in_db), rep_db))
dts = -1
if dts == lmod and dsz == sz and (nohash or dw[0] != "#"):
if dts == lmod and dsz == sz and (nohash or dw[0] != "#" or not sz):
continue
t = "reindex [{}] => [{}] ({}/{}) ({}/{})".format(
@@ -876,6 +924,10 @@ class Up2k(object):
db.n = 0
db.t = time.time()
if not self.args.no_dhash:
db.c.execute("delete from dh where d = ?", (drd,))
db.c.execute("insert into dh values (?,?)", (drd, dhash))
if self.stop:
return -1
@@ -894,15 +946,14 @@ class Up2k(object):
t = "forgetting {} shadowed autoindexed files in [{}] > [{}]"
self.log(t.format(n, top, rd))
q = "delete from dh where (d = ? or d like ?||'%')"
db.c.execute(q, (erd, erd + "/"))
q = "delete from up where (rd = ? or rd like ?||'%') and at == 0"
db.c.execute(q, (erd, erd + "/"))
ret += n
# drop missing files
rd = cdir[len(top) + 1 :].strip("/")
if WINDOWS:
rd = rd.replace("\\", "/").strip("/")
q = "select fn from up where rd = ?"
try:
c = db.c.execute(q, (rd,))
@@ -953,6 +1004,7 @@ class Up2k(object):
self.log("forgetting {} deleted dirs, {} files".format(len(rm), n_rm))
for rd in rm:
cur.execute("delete from dh where d = ?", (rd,))
cur.execute("delete from up where rd = ?", (rd,))
# then shadowed deleted files
@@ -1114,10 +1166,43 @@ class Up2k(object):
reg = self.register_vpath(ptop, vol.flags)
assert reg and self.pp
cur = self.cur[ptop]
if not self.args.no_dhash:
with self.mutex:
c = cur.execute("select k from kv where k = 'tagscan'")
if not c.fetchone():
return 0, 0, bool(self.mtag)
ret = self._build_tags_index_2(ptop)
with self.mutex:
self._set_tagscan(cur, False)
cur.connection.commit()
return ret
def _set_tagscan(self, cur: "sqlite3.Cursor", need: bool) -> bool:
if self.args.no_dhash:
return False
c = cur.execute("select k from kv where k = 'tagscan'")
if bool(c.fetchone()) == need:
return False
if need:
cur.execute("insert into kv values ('tagscan',1)")
else:
cur.execute("delete from kv where k = 'tagscan'")
return True
def _build_tags_index_2(self, ptop: str) -> tuple[int, int, bool]:
entags = self.entags[ptop]
flags = self.flags[ptop]
cur = self.cur[ptop]
n_add = 0
n_rm = 0
if "e2tsr" in flags:
with self.mutex:
@@ -1203,8 +1288,8 @@ class Up2k(object):
with self.mutex:
try:
q = "select rd, fn from up where substr(w,1,16)=? and +w=?"
rd, fn = cur.execute(q, (w[:16], w)).fetchone()
q = "select rd, fn, ip, at from up where substr(w,1,16)=? and +w=?"
rd, fn, ip, at = cur.execute(q, (w[:16], w)).fetchone()
except:
# file modified/deleted since spooling
continue
@@ -1219,9 +1304,14 @@ class Up2k(object):
abspath = os.path.join(ptop, rd, fn)
self.pp.msg = "c{} {}".format(nq, abspath)
if not mpool:
n_tags = self._tagscan_file(cur, entags, w, abspath)
n_tags = self._tagscan_file(cur, entags, w, abspath, ip, at)
else:
mpool.put(Mpqe({}, entags, w, abspath, {}))
if ip:
oth_tags = {"up_ip": ip, "up_at": at}
else:
oth_tags = {}
mpool.put(Mpqe({}, entags, w, abspath, oth_tags))
with self.mutex:
n_tags = len(self._flush_mpool(cur))
@@ -1313,6 +1403,9 @@ class Up2k(object):
if "OFFLINE" not in self.volstate[k]:
self.volstate[k] = "online, idle"
if self.args.exit == "idx":
self.hub.sigterm()
def _run_one_mtp(self, ptop: str, gid: int) -> None:
if gid != self.gid:
return
@@ -1361,8 +1454,8 @@ class Up2k(object):
if w in in_progress:
continue
q = "select rd, fn from up where substr(w,1,16)=? limit 1"
rd, fn = cur.execute(q, (w,)).fetchone()
q = "select rd, fn, ip, at from up where substr(w,1,16)=? limit 1"
rd, fn, ip, at = cur.execute(q, (w,)).fetchone()
rd, fn = s3dec(rd, fn)
abspath = os.path.join(ptop, rd, fn)
@@ -1384,6 +1477,10 @@ class Up2k(object):
else:
oth_tags = {}
if ip:
oth_tags["up_ip"] = ip
oth_tags["up_at"] = at
jobs.append(Mpqe(parsers, set(), w, abspath, oth_tags))
in_progress[w] = True
@@ -1553,6 +1650,8 @@ class Up2k(object):
entags: set[str],
wark: str,
abspath: str,
ip: str,
at: float
) -> int:
"""will mutex"""
assert self.mtag
@@ -1566,6 +1665,10 @@ class Up2k(object):
self._log_tag_err("", abspath, ex)
return 0
if ip:
tags["up_ip"] = ip
tags["up_at"] = at
with self.mutex:
return self._tag_file(write_cur, entags, wark, abspath, tags)
@@ -1605,6 +1708,7 @@ class Up2k(object):
write_cur.execute(q, (wark[:16], k, v))
ret += 1
self._set_tagscan(write_cur, True)
return ret
def _orz(self, db_path: str) -> "sqlite3.Cursor":
@@ -1628,6 +1732,11 @@ class Up2k(object):
self.log("WARN: failed to upgrade from v4", 3)
if ver == DB_VER:
try:
self._add_dhash_tab(cur)
except:
pass
try:
nfiles = next(cur.execute("select count(w) from up"))[0]
self.log("OK: {} |{}|".format(db_path, nfiles))
@@ -1716,7 +1825,7 @@ class Up2k(object):
]:
cur.execute(cmd)
cur.connection.commit()
self._add_dhash_tab(cur)
self.log("created DB at {}".format(db_path))
return cur
@@ -1731,6 +1840,17 @@ class Up2k(object):
cur.connection.commit()
def _add_dhash_tab(self, cur: "sqlite3.Cursor") -> None:
# v5 -> v5a
for cmd in [
r"create table dh (d text, h text)",
r"create index dh_d on dh(d)",
r"insert into kv values ('tagscan',1)",
]:
cur.execute(cmd)
cur.connection.commit()
def _job_volchk(self, cj: dict[str, Any]) -> None:
if not self.register_vpath(cj["ptop"], cj["vcfg"]):
if cj["ptop"] not in self.registry:
@@ -2190,7 +2310,7 @@ class Up2k(object):
raise
if "e2t" in self.flags[ptop]:
self.tagq.put((ptop, wark, rd, fn))
self.tagq.put((ptop, wark, rd, fn, ip, at))
self.n_tagq += 1
return True
@@ -2836,7 +2956,7 @@ class Up2k(object):
with self.mutex:
self.n_tagq -= 1
ptop, wark, rd, fn = self.tagq.get()
ptop, wark, rd, fn, ip, at = self.tagq.get()
if "e2t" not in self.flags[ptop]:
continue
@@ -2847,6 +2967,8 @@ class Up2k(object):
ntags1 = len(tags)
parsers = self._get_parsers(ptop, tags, abspath)
if parsers:
tags["up_ip"] = ip
tags["up_at"] = at
tags.update(self.mtag.get_bin(parsers, abspath, tags))
except Exception as ex:
self._log_tag_err("", abspath, ex)
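
(summary of the up2k.py changes above, with an illustrative sketch -- not the actual implementation: each folder's sorted file listing is fingerprinted with sha1, the fingerprint is kept in the new `dh` table, and a rescan returns early whenever the stored fingerprint still matches)

```python
import base64
import hashlib
import os
import stat


def dir_fingerprint(cdir):
    # sorted (size, mtime, name) tuples for regular files only,
    # mirroring the `files` list built during the folder walk
    files = []
    for name in sorted(os.listdir(cdir)):
        inf = os.stat(os.path.join(cdir, name))
        if stat.S_ISREG(inf.st_mode):
            files.append((inf.st_size, int(inf.st_mtime), name))

    zh = hashlib.sha1(str(files).encode("utf-8", "replace"))
    return base64.urlsafe_b64encode(zh.digest()[:12]).decode("ascii")

# rescan: `select d from dh where d = ? and h = ?` -- on a hit the
# folder is assumed unchanged and the per-file db checks are skipped
```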

@@ -1327,6 +1327,10 @@ def vsplit(vpath: str) -> tuple[str, str]:
return vpath.rsplit("/", 1) # type: ignore
def vjoin(rd: str, fn: str) -> str:
return rd + "/" + fn if rd else fn
def w8dec(txt: bytes) -> str:
"""decodes filesystem-bytes to wtf8"""
if PY2:

@@ -11,6 +11,7 @@ var Ls = {
"q": "quality / bitrate",
"Ac": "audio codec",
"Vc": "video codec",
"Fmt": "format / container",
"Ahash": "audio checksum",
"Vhash": "video checksum",
"Res": "resolution",
@@ -317,6 +318,7 @@ var Ls = {
"u_ehssrch": "server rejected the request to perform search",
"u_ehsinit": "server rejected the request to initiate upload",
"u_ehsdf": "server ran out of disk space!\n\nwill keep retrying, in case someone\nfrees up enough space to continue",
"u_emtleak": "it looks like your webbrowser may have a memory leak;\nplease try the following:\n<ul><li>hit <code>F5</code> to refresh the page</li><li>then disable the &nbsp;<code>mt</code>&nbsp; button in the &nbsp;<code>⚙️ settings</code></li><li>and try that upload again</li></ul>Uploads will be a bit slower, but oh well.\nSorry for the trouble!",
"u_s404": "not found on server",
"u_expl": "explain",
"u_tu": '<p class="warn">WARNING: turbo enabled, <span>&nbsp;client may not detect and resume incomplete uploads; see turbo-button tooltip</span></p>',
@@ -348,6 +350,7 @@ var Ls = {
"q": "kvalitet / bitrate",
"Ac": "lyd-format",
"Vc": "video-format",
"Fmt": "format / innpakning",
"Ahash": "lyd-kontrollsum",
"Vhash": "video-kontrollsum",
"Res": "oppløsning",
@@ -654,6 +657,7 @@ var Ls = {
"u_ehssrch": "server nektet forespørselen om å utføre søk",
"u_ehsinit": "server nektet forespørselen om å begynne en ny opplastning",
"u_ehsdf": "serveren er full!\n\nprøver igjen regelmessig,\ni tilfelle noen rydder litt...",
"u_emtleak": "uff, det er mulig at nettleseren din har en minnelekkasje...\nForeslår at du prøver følgende:\n<ul><li>trykk F5 for å laste siden på nytt</li><li>så skru av &nbsp;<code>mt</code>&nbsp; bryteren under &nbsp;<code>⚙️ innstillinger</code></li><li>og forsøk den samme opplastningen igjen</li></ul>Opplastning vil gå litt tregere, men det får så være.\nBeklager bryderiet!",
"u_s404": "ikke funnet på serveren",
"u_expl": "forklar",
"u_tu": '<p class="warn">ADVARSEL: turbo er på, <span>&nbsp;avbrutte opplastninger vil muligens ikke oppdages og gjenopptas; hold musepekeren over turbo-knappen for mer info</span></p>',

@@ -847,6 +847,7 @@ function up2k_init(subtle) {
},
"car": 0,
"slow_io": null,
"oserr": false,
"modn": 0,
"modv": 0,
"mod0": null
@@ -1365,6 +1366,14 @@ function up2k_init(subtle) {
etaskip = 0;
}
function got_oserr() {
if (!hws.length || !uc.hashw || st.oserr)
return;
st.oserr = true;
modal.alert(L.u_emtleak);
}
/////
////
/// actuator
@@ -1723,6 +1732,7 @@ function up2k_init(subtle) {
pvis.seth(t.n, 2, err + ' @ ' + car);
console.log('OS-error', reader.error, '@', car);
handled = true;
got_oserr();
}
if (handled) {
@@ -1841,6 +1851,8 @@ function up2k_init(subtle) {
pvis.seth(t.n, 1, d[1]);
pvis.seth(t.n, 2, d[2]);
console.log(d[1], d[2]);
if (d[1] == 'OS-error')
got_oserr();
pvis.move(t.n, 'ng');
apop(st.busy.hash, t);

@@ -8,7 +8,7 @@ function hex2u8(txt) {
var subtle = null;
try {
subtle = crypto.subtle || crypto.webkitSubtle;
subtle = crypto.subtle;
subtle.digest('SHA-512', new Uint8Array(1)).then(
function (x) { },
function (x) { load_fb(); }
@@ -23,11 +23,20 @@ function load_fb() {
}
var reader = null,
busy = false;
onmessage = (d) => {
var [nchunk, fobj, car, cdr] = d.data,
t0 = Date.now(),
if (busy)
return postMessage(["panic", 'worker got another task while busy']);
if (!reader)
reader = new FileReader();
var [nchunk, fobj, car, cdr] = d.data,
t0 = Date.now();
reader.onload = function (e) {
try {
//console.log('[ w] %d HASH bgin', nchunk);
@@ -39,6 +48,7 @@ onmessage = (d) => {
}
};
reader.onerror = function () {
busy = false;
var err = reader.error + '';
if (err.indexOf('NotReadableError') !== -1 || // win10-chrome defender
@@ -49,12 +59,14 @@ onmessage = (d) => {
postMessage(["ferr", err]);
};
//console.log('[ w] %d read bgin', nchunk);
busy = true;
reader.readAsArrayBuffer(
File.prototype.slice.call(fobj, car, cdr));
var hash_calc = function (buf) {
var hash_done = function (hashbuf) {
busy = false;
try {
var hslice = new Uint8Array(hashbuf).subarray(0, 33);
//console.log('[ w] %d HASH DONE', nchunk);

@@ -1,3 +1,104 @@
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
# 2022-0815-1825 `v1.3.14` fix windows db
after two exciting releases, time for something boring
* read-only demo server at https://a.ocv.me/pub/demo/
* latest gzip edition of the sfx: [v1.0.14](https://github.com/9001/copyparty/releases/tag/v1.0.14#:~:text=release-specific%20notes)
## new features
* upload-info (ip and timestamp) is provided to `mtp` tagparser plugins as json
* tagscanner will index `fmt` (file-format / container type) by default
  * and `description` can be enabled in `-mte`
## bugfixes
* [v1.3.12](https://github.com/9001/copyparty/releases/tag/v1.3.12) broke file-indexing on windows if an entire HDD was mounted as a volume
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
# 2022-0812-2258 `v1.3.12` quickboot
* read-only demo server at https://a.ocv.me/pub/demo/
* latest gzip edition of the sfx: [v1.0.14](https://github.com/9001/copyparty/releases/tag/v1.0.14#:~:text=release-specific%20notes)
## new features
*but wait, there's more!*   not only do you get the [multithreaded file hashing](https://github.com/9001/copyparty/releases/tag/v1.3.11) but also --
* faster bootup and volume reindexing when `-e2ds` (file indexing) is enabled
  * `3x` faster is probably the average on most instances; more files per folder = faster
  * `9x` faster on a 36 TiB zfs music/media nas with `-e2ts` (metadata indexing), dropping from 46sec to 5sec
  * and `34x` on another zfs box, 63sec -> 1.8sec
  * new arg `--no-dhash` disables the speedhax in case it's buggy (skipping files or audio tags)
* add option `--exit idx` to abort and shutdown after volume indexing has finished
## bugfixes
* [u2cli](https://github.com/9001/copyparty/tree/hovudstraum/bin#up2kpy): detect and skip uploading from recursive symlinks
* stop reindexing empty files on startup
* support fips-compliant cpython builds
  * replaces md5 with sha1, changing the filetype-associated colors in the gallery view
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
# 2022-0810-2135 `v1.3.11` webworkers
* read-only demo server at https://a.ocv.me/pub/demo/
* latest gzip edition of the sfx: [v1.0.14](https://github.com/9001/copyparty/releases/tag/v1.0.14#:~:text=release-specific%20notes)
## new features
* multithreaded file hashing! **300%** average speed increase
  * when uploading files through the browser client, based on web-workers
    * `4.5x` faster on http from a laptop -- `146` -> `670` MiB/s
    * ` 30%` faster on https from a laptop -- `552` -> `716` MiB/s
    * `4.2x` faster on http from android -- `13.5` -> `57.1` MiB/s
    * `5.3x` faster on https from android -- `13.8` -> `73.3` MiB/s
  * can be disabled using the `mt` togglebtn in the settings pane, for example if your phone runs out of memory (it eats ~250 MiB extra RAM)
  * `2.3x` faster [u2cli](https://github.com/9001/copyparty/tree/hovudstraum/bin#up2kpy) (cmd-line client) -- `398` -> `930` MiB/s
  * `2.4x` faster filesystem indexing on the server
* thx to @kipukun for the webworker suggestion!
## bugfixes
* ux: reset scroll when navigating into a new folder
* u2cli: better errormsg if the server's tls certificate got rejected
* js: more futureproof cloudflare-challenge detection (they got a new one recently)
## other changes
* print warning if the python interpreter was built with an unsafe sqlite
* u2cli: add helpful messages on how to make it run on python 2.6
**trivia:** due to a [chrome bug](https://bugs.chromium.org/p/chromium/issues/detail?id=1352210), http can sometimes be faster than https now ¯\\\_(ツ)\_/¯
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
# 2022-0803-2340 `v1.3.10` folders first
* read-only demo server at https://a.ocv.me/pub/demo/
* latest gzip edition of the sfx: [v1.0.14](https://github.com/9001/copyparty/releases/tag/v1.0.14#:~:text=release-specific%20notes)
## new features
* faster
  * tag scanner
  * on windows: uploading to fat32 or smb
* toggle-button to sort folders before files (default-on)
  * almost the same as before, but now also when sorting by size / date
* repeatedly hit `ctrl-c` to force-quit if everything dies
* new file-indexing guards
  * `--xdev` / volflag `:c,xdev` stops if it hits another filesystem (bindmount/symlink)
  * `--xvol` / volflag `:c,xvol` does not follow symlinks pointing outside the volume
  * only affects file indexing -- does NOT prevent access!
## bugfixes
* forget uploads that failed to initialize (allows retry in another folder)
* wrong filekeys in upload response if volume path contained a symlink
* faster shutdown on `ctrl-c` while hashing huge files
* ux: fix navpane covering files on horizontal scroll
## other changes
* include version info in the base64 crash-message
* ux: make upload errors more visible on mobile
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
# 2022-0727-1407 `v1.3.8` more async

@@ -14,10 +14,6 @@ gtar=$(command -v gtar || command -v gnutar) || true
realpath() { grealpath "$@"; }
}
which md5sum 2>/dev/null >/dev/null &&
md5sum=md5sum ||
md5sum="md5 -r"
mode="$1"
[ -z "$mode" ] &&

@@ -69,6 +69,9 @@ pybin=$(command -v python3 || command -v python) || {
exit 1
}
[ $CSN ] ||
CSN=sfx
langs=
use_gz=
zopf=2560
@@ -99,9 +102,9 @@ stamp=$(
done | sort | tail -n 1 | sha1sum | cut -c-16
)
rm -rf sfx/*
mkdir -p sfx build
cd sfx
rm -rf $CSN/*
mkdir -p $CSN build
cd $CSN
tmpdir="$(
printf '%s\n' "$TMPDIR" /tmp |
@@ -237,7 +240,7 @@ ts=$(date -u +%s)
hts=$(date -u +%Y-%m%d-%H%M%S) # --date=@$ts (thx osx)
mkdir -p ../dist
sfx_out=../dist/copyparty-sfx
sfx_out=../dist/copyparty-$CSN
echo cleanup
find -name '*.pyc' -delete
@@ -371,7 +374,7 @@ gzres() {
}
zdir="$tmpdir/cpp-mksfx"
zdir="$tmpdir/cpp-mk$CSN"
[ -e "$zdir/$stamp" ] || rm -rf "$zdir"
mkdir -p "$zdir"
echo a > "$zdir/$stamp"
@@ -402,8 +405,8 @@ sed -r 's/(.*)\.(.*)/\2 \1/' | LC_ALL=C sort |
sed -r 's/([^ ]*) (.*)/\2.\1/' | grep -vE '/list1?$' > list1
for n in {1..50}; do
(grep -vE '\.(gz|br)$' list1; grep -E '\.(gz|br)$' list1 | shuf) >list || true
s=$(md5sum list | cut -c-16)
(grep -vE '\.(gz|br)$' list1; grep -E '\.(gz|br)$' list1 | (shuf||gshuf) ) >list || true
s=$( (sha1sum||shasum) < list | cut -c-16)
grep -q $s "$zdir/h" && continue
echo $s >> "$zdir/h"
break
@@ -423,7 +426,7 @@ pe=bz2
echo compressing tar
# detect best level; bzip2 -7 is usually better than -9
for n in {2..9}; do cp tar t.$n; $pc -$n t.$n & done; wait; mv -v $(ls -1S t.*.$pe | tail -n 1) tar.bz2
for n in {2..9}; do cp tar t.$n; nice $pc -$n t.$n & done; wait; mv -v $(ls -1S t.*.$pe | tail -n 1) tar.bz2
rm t.* || true
exts=()

@@ -1,6 +1,8 @@
#!/bin/bash
set -e
parallel=2
cd ~/dev/copyparty/scripts
v=$1
@@ -21,16 +23,31 @@ v=$1
./make-tgz-release.sh $v
}
rm -f ../dist/copyparty-sfx.*
rm -f ../dist/copyparty-sfx*
shift
./make-sfx.sh "$@"
f=../dist/copyparty-sfx.py
[ -e $f ] ||
f=../dist/copyparty-sfx-gz.py
f=../dist/copyparty-sfx
[ -e $f.py ] ||
f=../dist/copyparty-sfx-gz
$f.py -h >/dev/null
[ $parallel -gt 1 ] && {
printf '\033[%s' s 2r H "0;1;37;44mbruteforcing sfx size -- press enter to terminate" K u "7m $* " K $'27m\n'
trap "rm -f .sfx-run; printf '\033[%s' s r u" INT TERM EXIT
touch .sfx-run
for ((a=0; a<$parallel; a++)); do
while [ -e .sfx-run ]; do
CSN=sfx$a ./make-sfx.sh re "$@"
mv $f$a.py $f.$(wc -c <$f$a.py | awk '{print$1}').py
done &
done
read
exit
}
$f -h
while true; do
mv $f $f.$(wc -c <$f | awk '{print$1}')
mv $f.py $f.$(wc -c <$f.py | awk '{print$1}').py
./make-sfx.sh re "$@"
done

@@ -213,11 +213,11 @@ def yieldfile(fn):
def hashfile(fn):
h = hashlib.md5()
h = hashlib.sha1()
for block in yieldfile(fn):
h.update(block)
return h.hexdigest()
return h.hexdigest()[:24]
def unpack():