PhilipTheBucket

PhilipTheBucket@piefed.social · 7 months

“And so then Lemmy said ‘Hey you know what would be a good idea is if we copied that model exactly’” :-(

PhilipTheBucket@piefed.social · 7 months

I absolutely have observed mods on Lemmy engaging in the same shitty behaviors (I am sitting on a post about a little cabal of !progressivepolitics@lemmy.world people who seem to be trying to rig the discourse in a particular direction to meet their electoral goals). But the simple fact of it being less centralized and more transparent (and with more of a culture of effective pushback against the mods) makes it a lot harder. They can’t just say “lol get fucked” like the mod from this post did and have that be the end of the story.

PhilipTheBucket@piefed.social · 7 months

Oops, you are 100% correct, I fixed the title.

PhilipTheBucket@piefed.social · 7 months

Edit: Oops, it is /r/Tennessee, not /r/ProgressiveHQ

PhilipTheBucket@piefed.social · 8 months

It is a wonderful project, I got a little rush of nostalgia just from the mention of GW-BASIC.

Make sure you research what the “right way” is to approach some of the difficult problems you will run into; you will learn more that way instead of needing to struggle with figuring out how to pick out where the end of the current loop is or whatever. It will be more enjoyable and manageable I think, and also you will level up your skill.

I do think you should leave the emojis out of the summary, people will think you had ChatGPT write the description lol.

Good luck! It sounds like an endeavor. It will probably not get some wide adoption, as you mention there are plenty of these tools already, but that is not the reason to do it.

PhilipTheBucket@piefed.social · 8 months

Oh, true that, I see now. Yeah, you’re right I think.

PhilipTheBucket@piefed.social · 8 months

Did they? It does say “by Reddit.” Where did you learn this?

PhilipTheBucket@piefed.social · 8 months

Oh… I get it. I looked more. So the issue and the complaint wasn’t just that an anime image of a totally naked cyborg woman was treated as NSFW and put behind a click-to-show thing or something. That would have been fine, I do feel like that’s normal. The issue is that they have some kind of image-alteration gimmick set up which interprets the red cables as blood or gore or something, and modifies the original photo to block out that section permanently. That’s weird, yes. I don’t think the issue is “even vaguely nude,” though.

PhilipTheBucket@piefed.social · 8 months

I mean, in fairness, that’s pretty obviously a nipple.

PhilipTheBucket@piefed.social · 8 months

Sure. I’m saying I tested it against bz2, looked up some rough details of how it works, and got a sense of what the strengths and weaknesses are, and you are wrong that it is simply “the best.” I actually do think it’s plausibly “the best” for applications where speed of compression is paramount and you still need decent compression, which is probably a lot of them. Having learned that, I’ve completed what I wanted to get out of this conversation.

PhilipTheBucket@piefed.social · 8 months

> Let me revise that statement to - it’s better in every metric (compression speed, compressed size, feature set, most importantly decompression speed) compared to all other compressors I’m aware of, apart from xz and bz2 and potentially other non-lz compressors in the best compression ratio aspect.

Your Cloudflare post literally says “a new compression algorithm that we have found compresses data 42% faster than Brotli while maintaining almost the same compression levels.” Yes, I get that in some circumstances where compression speed is important, this might be very useful. I don’t see the point in talking further in circles anymore, thank you for the information.

PhilipTheBucket@piefed.social · 8 months

> You must be living in a different bubble than me then, because I see zstd used everywhere, from my Linux package manager, my Linux kernel boot image, to my browser getting served zstd content-encoding by default

Clearly a different bubble lol.

What distro are you using that uses zstd? Both kernel images and packages seem like a textbook case where compressed size is more important than speed of compression… which would mean not zstd. And of course I checked, it looks like NixOS uses bz2 for kernel images (which is obviously right to me) and gzip (!) for packages? Maybe? I’m not totally up to speed on it yet, but it sort of looks that way.

I mean I see the benchmarks, zstd looks nice. I checked this:

https://tools.paulcalvano.com/compression-tester/

… on lemmy.world, and it said that lemmy.world wasn’t offering zstd as an option. In its estimate, Brotli is way better than gzip, and sort of equivalent with zstd, with zstd often being slightly faster in compression. I get the idea, it sounds cool, but it sort of sounds like something that Facebook is pushing that’s of dubious usefulness unless you really have a need for much faster compression (which, to be fair, is a lot of important use cases).

Yeah, I think of bz2 as sort of maximal compression at the cost of slower speed, gzip as the standard if you just want “compression” in general and don’t care that much, and then a little menagerie of higher performance options if you care enough to optimize. The only thing that struck me as weird about what you were saying was claiming it’s better in every metric (instead of it just being a good project that focuses on high speed and okay compression) and a global standard (instead of being something new-ish that is useful in some specific scenarios). And then when I tried both zstd and this other new Facebook thing and they were both worse (on compression) than bz2 which has been around for ages I became a lot more skeptical…

PhilipTheBucket@piefed.social · 8 months

What are you basing this all on?

$ time (cat optimizer.bin | bzip2 > optimizer.bin.bz2)

real	0m4.352s
user	0m4.244s
sys	0m0.135s

$ time (cat optimizer.bin | zstd -19 > optimizer.bin.zst)

real	0m12.786s
user	0m28.457s
sys	0m0.237s

$ ls -lh optimizer.bin*
-rw-r--r-- 1 billy users 76M Oct 20 17:54 optimizer.bin
-rw-r--r-- 1 billy users 56M Oct 20 17:55 optimizer.bin.bz2
-rw-r--r-- 1 billy users 59M Oct 20 17:56 optimizer.bin.zst

$ time (cat stocks-part-2022-08.tar | bzip2 > stocks-part-2022-08.tar.bz2)

real	0m3.845s
user	0m3.788s
sys	0m0.103s

$ time (cat stocks-part-2022-08.tar | zstd -19 > stocks-part-2022-08.zst)

real	0m34.917s
user	1m12.811s
sys	0m0.211s

$ ls -lh stocks-part-2022-08.*
-rw-r--r-- 1 billy users 73M Oct 20 17:57 stocks-part-2022-08.tar
-rw-r--r-- 1 billy users 26M Oct 20 17:58 stocks-part-2022-08.tar.bz2
-rw-r--r-- 1 billy users 27M Oct 20 17:59 stocks-part-2022-08.zst

Are you looking at https://jdlm.info/articles/2017/05/01/compression-pareto-docker-gnuplot.html or something? I would expect Lempel-Ziv to perform phenomenally on genomic data because of how many widely separated repeated sequences the data will have… for that specific domain I could see zstd being a clear winner (super fast obviously and also happens to have the best compression, although check the not-starting-at-0 Y axis to put that in context).

I have literally never heard of someone claiming zstd was the best overall general purpose compression. Where are you getting this?
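For anyone who wants to repeat this kind of comparison without shell pipelines, here is a minimal sketch using only Python's standard library. It is not the test I ran above: zstd is left out because it usually needs a third-party package, the `benchmark` helper and the sample data are made up for illustration, and the numbers will vary a lot with the input.

```python
import bz2
import lzma
import time
import zlib


def benchmark(data: bytes) -> dict:
    """Compress data with several stdlib codecs; return {name: (size_bytes, seconds)}."""
    codecs = {
        "gzip-9 (zlib)": lambda d: zlib.compress(d, 9),
        "bzip2-9": lambda d: bz2.compress(d, 9),
        "xz-6 (lzma)": lambda d: lzma.compress(d, preset=6),
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        out = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = (len(out), elapsed)
    return results


if __name__ == "__main__":
    # Hypothetical sample input: repetitive CSV-like text, not the files from the post.
    sample = b"".join(
        b"%d,AAPL,%d.25,%d\n" % (i, 100 + i % 7, i * 3) for i in range(50_000)
    )
    print(f"original: {len(sample)} bytes")
    for name, (size, secs) in benchmark(sample).items():
        print(f"{name:15s} {size:>9d} bytes  {secs:.3f}s")
```

Reading a real file with `open(path, "rb").read()` and passing it to `benchmark` gives size and wall-clock time side by side, which is the trade-off this whole argument is about.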

PhilipTheBucket@piefed.social · 8 months

Yes, Lempel-Ziv is incredibly fast in compression. That’s because it’s a sort of elegant hack from the 1970s that more or less gets lucky in terms of how it can be made to work to compress files. It’s very nice. You said “by almost any metric,” though, not “by compression speed and literally nothing else.” There is a reason web pages default to using gzip instead of zstd for example.

Absolutely no idea what you’re on about with >100 MB. I’ve used bzip2 for all my hard disk backups for about 20 years now, and I think I broke the 100 MB barrier for local storage at some point during that time.

PhilipTheBucket@piefed.social · 8 months

> the current state of the art for generic compression by almost any metric

$ ls -lh optimizer*
-rw-r--r-- 1 billy users 76M Oct 19 15:51 optimizer.bin
-rw-r--r-- 1 billy users 56M Oct 19 15:51 optimizer.bin.bz2
-rw-r--r-- 1 billy users 60M Oct 19 15:51 optimizer.bin.zstd

I mean apparently not.

(Lempel-Ziv is not the best compression that’s currently known by a wide margin. It’s very fast and it’s nicely elegant but I would expect almost any modern “next gen compression” to be based on Huffman trees at the very core, or else specialized lossy compression. Maybe I am wrong, I’m not super up to speed on this stuff, but zstd is not state of the art, that much I definitely know.)

> Of course this is not better at generic compression because that’s not what it’s for.

They specifically offered csv as an example of a thing it can handle, that’s why I chose that as one of the tests.

PhilipTheBucket@piefed.social · 8 months

I strongly suspect that it’s a bunch of “machine learning” hooey. If your compression is capable at all, it should be able to spend a few bits on categorizing what the “format” type stuff he’s talking about is, and then do pretty much equally well as whatever specialized compressor. I won’t say it will never be useful for some kind of data that has patterns and regularity that are not immediately obvious unless you spell it out for the compressor (2d images where there are similarities between the same positions on consecutive lines widely separated in the bytestream for example), but my guess is that this is a bunch of hype and garbage.
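To make the 2D-image case concrete, here is a rough sketch of what "spelling it out for the compressor" can look like. It subtracts each byte from the one directly above it (the same idea as PNG's "Up" filter) before handing the data to bz2. The synthetic image and the `up_filter` and `make_image` helpers are invented for this illustration and say nothing about how the tool under discussion works.

```python
import bz2
import random


def up_filter(pixels: bytes, width: int) -> bytes:
    """Replace each byte with its difference from the byte one row above (mod 256).

    The first row is kept as-is, since it has no row above it.
    """
    out = bytearray(pixels[:width])
    for i in range(width, len(pixels)):
        out.append((pixels[i] - pixels[i - width]) & 0xFF)
    return bytes(out)


def make_image(width: int, height: int, seed: int = 0) -> bytes:
    """Synthetic 8-bit image: a random first row, each later row is the row above plus 1."""
    rng = random.Random(seed)
    row = [rng.randrange(256) for _ in range(width)]
    rows = bytearray()
    for _ in range(height):
        rows.extend(row)
        row = [(v + 1) & 0xFF for v in row]
    return bytes(rows)


if __name__ == "__main__":
    width, height = 64, 256
    image = make_image(width, height)
    raw_size = len(bz2.compress(image))
    filtered_size = len(bz2.compress(up_filter(image, width)))
    print(f"uncompressed:         {len(image)} bytes")
    print(f"bz2, no hint:         {raw_size} bytes")
    print(f"bz2 after row filter: {filtered_size} bytes")
```

The rows here never repeat byte-for-byte, so a general-purpose compressor has little to grab onto, while the filtered version is almost entirely the value 1. This is a deliberately extreme case; real images land somewhere in between.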

Just out of curiosity, I downloaded it and did the quickstart to test my assumption. Results I got:

$ ls -lh reads*
-rw-r--r-- 1 billy users  27M Oct 19 15:14 reads.csv
-rw-r--r-- 1 billy users 4.2M Oct 19 15:15 reads.csv.bz2
-rw-r--r-- 1 billy users 6.7M Oct 19 15:16 reads.csv.zl

So yeah I think at least at first look, for general-purpose compression it’s trash. IDK. I also tried exactly what it sounds like their use case is, compressing PyTorch models, and it’s kinda cool maybe (and certainly faster than bzip2 for those models) but at best it seems like a one-trick pony.

$ ls -lh optimizer*
-rw-r--r-- 1 billy users  76M Oct 19 15:26 optimizer.bin
-rw-r--r-- 1 billy users  56M Oct 19 15:27 optimizer.bin.bz2
-rw-r--r-- 1 billy users  53M Oct 19 15:26 optimizer.bin.zl

I feel like maybe building Huffman trees based on general-purpose prediction of what comes next, and teaching that how to grasp what the next bits might turn out to be based on what has come before including traversing different formats or even just skipping backwards in the data by specified amounts, might be a better way than whatever this is doing. But doing way worse than bzip2 for simple textual data even when we give it the “format hint” that it’s looking for is a sign of problems to me.
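For reference, this is roughly what the Huffman part looks like in its simplest form: a static code built from plain byte frequencies. It is only a sketch. The predictive, context-aware model I am hand-waving about above would replace the `Counter` with something much smarter, and the function names here are mine, not from any library.

```python
import heapq
from collections import Counter


def huffman_code(data: bytes) -> dict:
    """Build a Huffman code {byte_value: bitstring} from byte frequencies in data."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tiebreak, {symbol: code built so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def encoded_bits(data: bytes, code: dict) -> int:
    """Total number of bits needed to encode data with the given code."""
    return sum(len(code[b]) for b in data)


if __name__ == "__main__":
    text = b"aaaaaaaabbbc"
    code = huffman_code(text)
    for sym, bits in sorted(code.items()):
        print(chr(sym), bits)
    print(encoded_bits(text, code), "bits vs", 8 * len(text), "bits raw")
```

Frequent symbols get short codes and rare ones get long codes, so the better the frequency estimate (or prediction) feeding the tree, the fewer bits come out the other end.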

PhilipTheBucket@piefed.social · 8 months

They could literally just have 3 interns make hundreds of fake accounts on various instances, and flood the network with low-grade but ultimately harmless content.

My guess is that it is either that, or else bribing people who have existing positions of trust on the network (admins or moderators or powerusers) to undertake some kind of destructive action.

The first is cheaper and much more effective (and won’t get detected instantly), the second is more in line with reddit-exec-brained thinking. So kind of a toss-up. Something along the lines of those ideas would be my guess though. I thought about some kind of “embrace, extend, extinguish” strategy like Facebook with Threads, but I think even Reddit isn’t stupid enough to try something like that (which would only result in a massive increase in the exodus of users from Reddit to Lemmy).

Edit: Well… something occurred to me. “Low-grade but ultimately harmless content” is exactly what they have been pushing Reddit towards, because they think it is better because it traps people in dopamine loops more effectively. Flooding Lemmy with that stuff (more so than it already is organically) would fuck it up from my point of view, but maybe from their point of view it would take something like really emotionally toxic random hatred in all directions. Maybe. Anyway, I think flooding Lemmy via sockpuppet accounts with content that turns people off is probably the easiest and most effective way, and I don’t see much way to prevent it.

PhilipTheBucket@piefed.social · 9 months

Like a lot of things, it works best when you can’t really consciously tell that it’s there.

An animation that’s too quick to really register is fulfilling the brief and making the interface better, without cluttering up the user’s conscious awareness. An animation that wants to slow down enough so that you can really feel that the designer put some work into this interface, and appreciate what a genius they are, is no good.

PhilipTheBucket@piefed.social · 9 months

This man is a board-certified turbo nerd. I very much like, for example, his succinct explanation of NixOS, with concrete examples of what it makes easy that can be remarkably difficult on other distros sometimes, and how he likes to time his arrival in meetings so that he comes in exactly on the second that the meeting starts. (I actually used to do the same with meetings that I was running, setting the clocks if I needed to so that their second hands were accurate.)

Also: “People will use screen sometimes, if they’re very old.” 😃

PhilipTheBucket

After 40 years of adventure games, Ron Gilbert pivots to outrunning Death

/r/Tennessee doesn't want people posting about the Tuesday election