𞋴𝛂𝛋𝛆

  • 110 Posts
  • 1.17K Comments
Joined 3 years ago
Cake day: June 9th, 2023


  • I wish it was. ComfyUI is shit. My external firewall and dns logs picked up some dubious shit. Tracking it down, there are parts and pieces in many places.

    I do not know the full scope.

    I do not want to talk about what I have been able to figure out in models because it may have broader implications and I am honestly not sure of all the factors involved yet, like the VAE, what exactly is on the second layer that is not in the vocab, and the role of BERT in the transformers package. That is what I am working on on the Stable Diffusion side. While testing the rewards system, I triggered some background system to package and try to send a sqlite3 database. I am tracking down the components of that system. The processes are unlabeled. The tty is manually created in Python. The agent is this weird distributed model. It is following instructions like an agent where the prompts are in a Google package in the Python venv. The actual prompts are in JSON files. The parts of this system are intermixed with other packages and code. There is also a bunch of functionality that appears to be embedded into the ComfyUI JavaScript. There are also parts of this system that are not activated yet but will check UV hashes. The way the database is sent over the network appears to use the same systemd module for the collective user profile system… The same system that will be doing age verification.

    Much of my searching for packages and names has been done from my home directory. So I was surprised to see the same reporting type database pop up with FreeCAD, and many packages also in flatpak containers. When I see the mechanisms used, it seems stupid obvious how many vectors involved should not be open by default on the host. Like why in the fuck should the kernel default pass no label packets and have access to namespaces outside of any reporting or logs. I was only able to find several components by looking at SELinux contexts. Anyone without SELinux enabled will never see the stuff.

    BTW, why the fucking attitude and disrespect?



  • If I see it again today I will try to reply again but use separate devices for here and WS. I’m air gapped on WS, tracking down the malware that is ComfyUI. See other comment for a few more basics. Don’t trust pip or especially UV. Read the source for everything you have from Python. Look for host OS escalation and obfuscation of stuff like namespaces, processes, and additional ttys. The dictionaries for Python under collections.abc are hashed for nefarious reasons. That is one way they determine if your stuff is bad think.

    From what I have seen, I want to be on a European Gentoo at this point, maybe even LFS.


  • Looks like AI stuff is also maybe creeping into age/id stuff.

    I’m super concerned because there is a bunch of Python that Fedora uses throughout.

    FreeCAD also has it now. Rather, has it in the flatpak.

    I am air gapped at the moment tracking down the garbage dump I stupidly failed to verify. As I grep, find, and locate those packages, I keep seeing problems crossing over into flatpak containers. Things like the default kernel setting passing no label packets, the level of access for host installed Python, noaccount, changing /proc, and allowing a process to escape namespaces are sus to me. This garbage allows Python to create a hidden tty, and hidden connections to Tor. That is straight up malware IMO.

    The hashing of collections.abc and how UV works is death to open source.



  • Complex social hierarchy is a super important aspect to account for too. In the proprietary software realm, you infer confidence from the accumulated wealth hierarchy. In FOSS the hierarchy is not wealth, but reputation like in academia or the film industry. If some company in Oman makes some really great proprietary app, are you going to build your European startup over top of it? Likewise, if in FOSS someone with no reputation makes some killer app, the first question to ask is whether this is going to anchor or support a stellar reputation. Maybe they are just showing off skills to land a job. If that is the case, they are just like startups that are only looking to get bought up quickly by some bigger fish. We are all conditioned to think in terms of hoarded wealth as the only form of hierarchy, but that is primitive. If all the wealth was gone, humans would still be fundamentally complex social animals, and would always establish a complex hierarchy. This is one of the spaces where it is different.



  • The main problem is when following instructions for command line tools. They might figure out how to use dnf instead of apt, but the extra layers required for ostree are not very friendly. There are a ton of potential frustrations in this area, especially with GPU stuff or hobbyist hardware like Arduino where kernel stuff is needed in userland. At least as of nearly 3 years ago, the documentation in this area sucks. I was on Silverblue for a few years and managed to get through the frustrations due to intermediate experience level. I found toolbox useless compared to distrobox. But using this with something like Arduino was annoying at best. The dependencies expected by whatever stuff I wanted to install were usually a big mystery, with near useless error messages and names of packages and libraries totally unrelated to the package naming in DNF. When updating the base OS, stuff built in these containers is totally useless because I could not update the containers to the new OS image. Playing around with Flash Forth on a microcontroller was even worse. I ended up layering a bunch of stuff on the host because the containers were just not working. When I got an Nvidia machine, I went to Fedora Workstation and have had far fewer issues and frustrations. SB wasn’t bad, but it is a pain to use these if you need kernel level access. Just my $0.02. I was actually on SB for ~2-3 years.




  • Depends on the system. Typically, the older systems do not work like this. The GPS satellites only transmit a signal that contains their location information and the time. The device must collect several of these signals and then use trilateration, solving from its distance to each satellite, to calculate your real location in time and position. Yes there are relativistic effects, due to the orbital speed of the satellites and the weaker gravity at their altitude.
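
    To make the distance-based calculation concrete, here is a minimal sketch in a flat 2D plane with three transmitters. That simplification is my assumption for illustration: a real receiver works in 3D and also has to solve for its own clock error, which is why it needs at least four satellites. The name `trilaterate_2d` is made up for this example.

```python
import math


def trilaterate_2d(anchors, distances):
    """Return (x, y) given three known points and the distance to each one."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = distances

    # Subtracting circle 1 from circles 2 and 3 cancels the squared unknowns,
    # leaving two linear equations: a*x + b*y = c and d*x + e*y = f.
    a, b = 2 * (x2 - x1), 2 * (y2 - y1)
    c = r1**2 - r2**2 - x1**2 + x2**2 - y1**2 + y2**2
    d, e = 2 * (x3 - x1), 2 * (y3 - y1)
    f = r1**2 - r3**2 - x1**2 + x3**2 - y1**2 + y3**2

    det = a * e - b * d
    if det == 0:
        raise ValueError("anchors are collinear; position is ambiguous")

    # Cramer's rule for the 2x2 linear system.
    return ((c * e - b * f) / det, (a * f - c * d) / det)


# Transmitters at known positions, receiver actually at (3, 4).
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
distances = [math.dist((3.0, 4.0), p) for p in anchors]
position = trilaterate_2d(anchors, distances)
```

    In a real receiver the distances come from signal travel time multiplied by the speed of light, so a tiny clock error turns into a large position error. That is the reason the timing side matters so much.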

    For instance, in home lab electrical engineering, if a person wants a really good reference clock but cannot afford a cesium atomic reference, they can use a relatively cheap GPS system to build a referenced oscillator that is disciplined by the reference clock on these satellites. I think they are cesium too, but it has been a while since Dave Jones made YT uploads on the eevblog about it. A Garmin bicycle computer is another example. It is triangulating the signals and plotting periodic waypoints with some basic averaging.

    That said, WiFi routers and cellular towers can be used for similar triangulation. Maybe check out Hak5 if they are still around. It has been a while since I looked them up, but they used to make pen testing red team stuff that will infer much about vulnerabilities.







  • llama.cpp is at the core of almost all software for running open weights models offline. The server it creates is OpenAI API compatible. Oobabooga Textgen WebUI is more GUI oriented but based on llama.cpp. Oobabooga has the setup for loading models with a split workload between the CPU and GPU, which makes larger gguf quantized models possible to run. llama.cpp has this feature; Oobabooga implements it. The model loading settings and softmax sampling settings take some trial and error to dial in well. It helps if you have a way of monitoring GPU memory usage in real time. Like I use a script that appends my terminal window title bar with GPU memory usage until inference time.
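
    This is not my actual script, just a minimal sketch of the title bar idea. It assumes an Nvidia card with `nvidia-smi` on the PATH and a terminal that honors the xterm-style title escape sequence. The function names are made up for this example.

```python
import subprocess
import sys
import time


def parse_gpu_memory(csv_line):
    """Parse one 'used, total' line (MiB) as printed by nvidia-smi in CSV mode."""
    used, total = (int(part.strip()) for part in csv_line.split(","))
    return used, total


def title_sequence(used_mib, total_mib):
    """Build the escape sequence (ESC ] 0 ; text BEL) that sets the window title."""
    return f"\033]0;GPU {used_mib}/{total_mib} MiB\007"


def query_gpu_memory():
    """Ask nvidia-smi for memory use of the first GPU it lists."""
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_memory(result.stdout.splitlines()[0])


def watch(interval_s=1.0):
    """Rewrite the terminal title once per interval until interrupted."""
    while True:
        sys.stdout.write(title_sequence(*query_gpu_memory()))
        sys.stdout.flush()
        time.sleep(interval_s)
```

    Call `watch()` in the terminal you load the model from and stop it with Ctrl+C. Watching the number while changing the GPU layer count is the quickest way I know to find where the model stops fitting.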

    Ollama is another common project people use for offline open weights models, and it also runs on top of llama.cpp. It is a lot easier to get started in some instances and several projects use Ollama as a baseline for “Hello World!” type stuff. It has pretty good model loading and softmax settings without any fuss, but it does this at the expense of only running on GPU or CPU but never both in a split workload. This may seem great at first, but if you never experience running much larger quantized models in the 30B-140B range, you are unlikely to have success or a positive experience overall. The much smaller models in the 4B-14B range are all that are likely to run fast enough on your hardware AND completely load in your GPU memory if you only have 8GB-24GB. Most of the newer models are actually Mixture of Experts architectures. This means it is like loading ~7 models initially, but then only inferencing two of them at any one time. All you need is the system memory or the DeepSpeed package (uses disk drive for excess space required) to load these larger models. Larger quantized models are much smarter and more capable. You also need llama.cpp if you want to use function calling for agentic behaviors. Look into the agentic API and pull history in this area of llama.cpp before selecting what models to test in depth.
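
    A rough way to see why those size ranges matter is to estimate the weight footprint from parameter count and bits per weight. This is a back-of-the-envelope sketch only. The `overhead_gib` default is a guess of mine standing in for context cache and runtime buffers, and real quantized files usually come out somewhat larger than the nominal bit width suggests.

```python
def weight_size_gib(params_billion, bits_per_weight):
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_vram(params_billion, bits_per_weight, vram_gib, overhead_gib=1.5):
    """True if weights plus an assumed fixed overhead fit in GPU memory."""
    return weight_size_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


small = weight_size_gib(8, 4)    # 8B model at a nominal 4 bits per weight
large = weight_size_gib(70, 4)   # 70B model at the same nominal quantization
```

    With these assumptions the 8B model lands under 4 GiB and fits an 8 GiB card, while the 70B model is over 30 GiB and does not fit a 24 GiB card without splitting the load with system memory.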

    Hugging Face is the go-to website for sharing and sourcing models. That is heavily integrated with GitHub, so it is probably as toxic long term, but I do not know of a real FOSS alternative for that one. Hosting models is massive I/O for a server.



  • The easiest way I know of to check any machine is to put another router or machine in front of it with a whitelist firewall or a way of logging DNS traffic. You just need to spot the address in the list.
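
    As a minimal sketch of spotting the address in the list, this checks logged DNS queries against a whitelist. It assumes dnsmasq-style query log lines of the form `query[A] name from client`, so adjust the pattern for whatever your upstream box actually logs. The function names and the sample domains are made up.

```python
import re

# Matches the "query[TYPE] domain from client" part of a dnsmasq-style log line.
QUERY_RE = re.compile(r"query\[[A-Za-z0-9]+\]\s+(\S+)\s+from\s+(\S+)")


def is_allowed(domain, allowlist):
    """True if the domain equals, or is a subdomain of, an allowed name."""
    domain = domain.lower().rstrip(".")
    return any(domain == name or domain.endswith("." + name) for name in allowlist)


def unexpected_queries(log_lines, allowlist, client=None):
    """Return (client, domain) pairs for queries that are not on the allowlist."""
    hits = []
    for line in log_lines:
        match = QUERY_RE.search(line)
        if not match:
            continue
        domain, source = match.groups()
        if client is not None and source != client:
            continue
        if not is_allowed(domain, allowlist):
            hits.append((source, domain))
    return hits


sample_log = [
    "Jan  1 00:00:01 dnsmasq[123]: query[A] updates.example.org from 192.168.1.50",
    "Jan  1 00:00:02 dnsmasq[123]: query[A] beacon.example.net from 192.168.1.50",
    "Jan  1 00:00:03 dnsmasq[123]: forwarded beacon.example.net to 9.9.9.9",
]
suspicious = unexpected_queries(sample_log, {"example.org"}, client="192.168.1.50")
```

    Filtering by `client` keeps the output focused on the one machine under suspicion instead of the whole network.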

    DNS filtering usually only filters on incoming packets, but for bot stuff that should catch issues.

    In general, most routers run everything from a serial flash chip on the board. These are usually 8, 16, or 32 megabytes. They have a simple bootloader like U-Boot. This is what loads the operating system. These devices have a UART serial port on the PCB. You can use a USB to serial UART adaptor to see what is happening in the device. With a proprietary OS, you are still likely to see the pre-init boot sequence that the bootloader prints to terminal. Most operating systems also print information to this interface, at least of the couple dozen junk devices I have been given and messed around with. I make a little mount for a USB to serial adaptor and add it to all of my routers when new, so I only need to plug in USB to get to the internal bootloader and tty terminal interface of OpenWRT. You will need to know the default baud rate of the device, although it is probably listed somewhere online or can be guessed as one of the common high values at or above 9600.

    Getting into this further gets complicated. It is probably better to look for any CVE that is relevant to the device or software and work backwards. Look for any software updates that have obfuscated the risk for each CVE. If the issue was not fixed, that is where to look to see if someone has exploited the device. Ultimately, they need clock cycles from the CPU scheduler. So it must be a process or some way of executing code from unregistered memory.

    This is getting to the edge of what I have messed around with and understand. There may be a way to get a memory map that includes unused pages, and compare that with a hex dump of the flash memory. This is outside of your scope of a proprietary OS, but hopefully frames the abstract scope of what is possible on this class of device when you have an open source stack. The main advantage of this kind of device and issue is that you can physically remove the flash chip and then see and manipulate every page and memory location. The device likely doesn’t have microcode loaded into the CPU(s) that make it challenging to determine what is going on.

    There is probably an easier way, but a hex dump of the current system can be hashed against the factory updated version to see if any differences are present. It is likely that any exploit will include a string with the address to connect to somewhere in flash memory. It could be obfuscated through encryption or a cipher, but a simple check for strings in the hex dump and a grep for “http” is a simple way to look for issues.
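
    A minimal sketch of both checks, assuming you already have the flash contents and a reference image as raw bytes. The function names and the sample bytes are made up. Keep in mind that a dump from a device that has been running will normally differ from a factory image in its writable config areas, so the hash comparison is only meaningful on regions that are supposed to be read-only.

```python
import hashlib
import re


def sha256_hex(data):
    """Hex digest of a firmware image or partition held in memory."""
    return hashlib.sha256(data).hexdigest()


def images_match(dump, reference):
    """True if the dumped image is byte-for-byte identical to the reference."""
    return sha256_hex(dump) == sha256_hex(reference)


def find_url_strings(data, min_len=6):
    """Rough equivalent of `strings | grep http` over a bytes object."""
    # Runs of printable ASCII at least min_len bytes long.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii")
            for m in pattern.finditer(data)
            if b"http" in m.group()]


# Fake bytes standing in for a flash dump; the URL is a placeholder.
reference = b"\x00\x01BOOT\x00kernel-data\xff\xff"
dump = reference + b"\x00http://example.invalid/checkin\x00\x7f"

changed = not images_match(dump, reference)
urls = find_url_strings(dump)
```

    For real dumps, read the files with `open(path, "rb").read()` and pass the bytes in. An obfuscated address will not show up this way, so a clean result proves nothing by itself.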

    The OpenWRT forum is a good general source. The people behind the bootloaders for these devices are also Linux kernel developers and on the OpenWRT forum.