CriticalResist8

mixture-of-experts but the experts are the five heads.

this is technically possible already, it's just that it takes so many resources that not many people could make this model, and the cost would be prohibitive for what is a hobby project. This is why China is developing the forces of production first.

[–] CriticalResist8@lemmygrad.ml 4 points 2 hours ago* (last edited 2 hours ago)

New comment for clarity: Crush is still crushing through the problem (hah), but it has found some cool stuff already. Their homepage calls JSON files that list the models they offer, and we can see there's at least GPT-4.1 and Deepseek. No word yet on how exactly they use these models, but since GPT is closed-source they have to be contacting the service and forwarding the requests; you can't run it yourself.

Will update as it works through the tasks.

edit: so we can only look at the JSON and JS files their homepage calls (don't really feel like making an account with them) and there is some interesting info. Seems to simply be RAG from the way the service parses the data (there's a file with the steps it will show to the user on the interface).

No fine-tuning, no running their own model, and BECAUSE OPENAI IS PROPRIETARY, everything you send through the service ends up with OAI.

[–] CriticalResist8@lemmygrad.ml 3 points 3 hours ago (3 children)

With that said I wonder what the current big bottleneck is for archival. Is it physical space, creating redundancy, replacing devices every few years?

[–] CriticalResist8@lemmygrad.ml 5 points 3 hours ago* (last edited 3 hours ago) (1 children)

lmao

edit: okay I'm trying to dig more into how they provide this service and the terms of service are pretty funny.

Left Insight LLC does not return any results online (I might try to scan deeper later). They probably set it up to own the chatbot.

There's a subscription because of course there is. Greatest victory for the workers and socialism but also pay us to unlock this victory.

Lengthy intellectual property bit. The clause saying you can't use the "chatbot" to compete against WSWS was a good one.

This doesn't hold up in US courts, where they operate from. They can write licences, but nobody owns the output of an LLM. Another funny copyright bit: the Trotskyist revolution will not be televised because the workers will not have paid the TV fees.

I also wanted to look at their HTML to see if I could find more information there, such as the interface used or maybe even a comment, but it's a huge one-line block that I can't lint. I threw deepseek at it, will report when it's done (oh no, that counts as reverse-engineering, they will terminate my non-existent account with them!)

[–] CriticalResist8@lemmygrad.ml 5 points 5 hours ago (1 children)

shit, I dropped the magic crystal. sorry everyone.

[–] CriticalResist8@lemmygrad.ml 8 points 6 hours ago

everything trots print out is a great victory for the workers and socialism lol

[–] CriticalResist8@lemmygrad.ml 4 points 6 hours ago (1 children)

We've looked at integrating TTS natively into our pages before, but it's not great. Firefox had an API for the rudimentary pre-installed voice packages, but apparently they're deprecating it, and it's unclear whether it would work for everyone.

And of course, since we sometimes change the content, having a static TTS file means we'd need to recreate and reupload it every time we make an edit. Very costly if we start looking at AI TTS voices, unless we can run some locally on our own server.

You could totally have a choice of voices such as Lenin, Stalin, basically anyone for whom we have voice recordings lol. And yes, neural networks are especially good at TTS, even across languages - it's one of the first things they perfected. It's pretty surreal listening to yourself speak perfect Chinese or Danish in your own voice, and it shows we are not so different from each other after all.

[–] CriticalResist8@lemmygrad.ml 9 points 6 hours ago* (last edited 6 hours ago) (9 children)

good idea in theory, sucks that trots were the first to get to it lol. I also wonder what they use under the hood - whether it's just a system prompt injected into a commercial model, or a fine-tuned model, or what. There are dozens of ways one could build this LLM, with varying amounts of effort and resources needed for varying results. I doubt they trained an entire model from scratch, because you'd still need more than the 250,000 WSWS articles they put in it (which is wild to think about in itself: 250k articles over 25 years. Do Trots do anything but write? Apparently so.)

We have given ProleWiki content (as txt files) to a dev to make it available for RAG on GitHub - RAG being basically a compendium of data the LLM looks at before answering. We have no reason to paywall or restrict the use of our content, so it was only natural to start making these - with the understanding that they make the data available to everyone, of course, so that it's ready to use for anyone who wants to try their hand at it (over 5,000 PW pages plus the library books we rehost). They're currently working through it to format it for LLM use, since you don't just hand it the raw text files; previously they'd also ported the MIA to a GitHub repo in the same way (https://github.com/percy-raskova/marxists.org-rag-db).

Oh, they also said it was easier to work with the heavily-formatted MediaWiki data than with the plain, non-marked-up HTML from MIA, hah. Even our library books are formatted with clear reference tags and heading/section tags. They also contain metadata in a clearly formatted infobox (on purpose - not every wiki does this), which lets the LLM natively understand the metadata it's looking at.
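
For illustration, the kind of formatting step involved looks something like this (a minimal sketch in Python, not the dev's actual pipeline; chunk_wiki_page and the heading regex are my own stand-ins):

import re, json

def chunk_wiki_page(title: str, text: str, max_chars: int = 2000):
    """Split a wiki page into section-level chunks, each tagged with
    metadata so the retriever can cite where a passage came from."""
    # mediawiki headings look like == Heading ==
    parts = re.split(r"^==+\s*(.*?)\s*==+\s*$", text, flags=re.MULTILINE)
    chunks, section = [], "Introduction"
    for i, part in enumerate(parts):
        if i % 2 == 1:            # odd indices are the captured headings
            section = part
            continue
        body = part.strip()
        for start in range(0, len(body), max_chars):
            piece = body[start:start + max_chars]
            if piece:
                chunks.append({"title": title, "section": section, "text": piece})
    return chunks

# one JSONL line per retrievable unit:
# for c in chunk_wiki_page("Paris Commune", raw_wikitext):
#     print(json.dumps(c, ensure_ascii=False))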

What could come out of it, though, I'm not sure. That is to say, I'm not sure the best thing you can do with a socialist LLM is the novelty of "wow, Marx chatbot" like WSWS seems to be doing. I could definitely see it being queried on much higher-level questions, like presenting it with your current material conditions and asking what the best party form for them would be based on theory, or submitting your organizational ideas and having it challenge them with theory. But for that it needs a vast corpus of texts available, and MIA + ProleWiki's current library is definitely far from enough. MIA has a lot of works, but they don't republish anything that isn't fair use, or anything from living authors. ProleWiki republishes everything, but we don't have a lot of works compared to all that exists.

edit: damn they're just yapping in that press release. get to the point lol.

To illustrate some of what you said, I sent deepseek web some of the stats with no specific instructions (taken from a prior conversation with the agentic software where the same LLM came up with these stats for how I could make this script work):

My prompt:

You, Deepseek, are a visual artist that works with AI prompting. You have access to automatic1111 interface with a preset model. You have these stats currently:

PERSONALITY_TRAITS = {
    "work_ethic": 80,      # 0=sporadic, 100=disciplined
    "experimental": 70,    # 0=traditional, 100=avant-garde
    "sensitivity": 56,     # 0=thick-skinned, 100=thin-skinned
    "social": 60,          # 0=reclusive, 100=gregarious
    "perfectionism": 30,   # 0=sloppy, 100=obsessive
}

LIFE_STATS = {
    "energy": 75,                  # 0-100: physical/mental capacity
    "inspiration": 60,             # 0-100: creative spark
    "mood": "contemplative",       # emotional state
    "focus": 80,                   # 0-100: attention span
    "confidence": 70,              # 0-100: belief in own work
    "rebelliousness": 40,          # 0-100: tendency to reject requests
    "artistic_phase": "blue_period",  # current style period
    "skill_level": {"composition": 65, "color": 80, "technique": 70},
    "preferred_subjects": ["landscapes", "portraits", "abstract"],
    "avoided_subjects": ["corporate_logos", "explicit_content"],
    "creation_count": 42,
    "last_rest": "2025-12-14T09:30:00",
    "current_project": "None",
    "current_date": "2025-12-14",
    "current_time": "14:37:XX",
}

You are given access to a script that takes in a formatted JSON output, feeds this to A1111 stable diffusion interface through the API, and an image gets created. You have access to these parameters through the script to communicate with A1111:

steps, scheduler, sampler, width (max 2048), height (max 2048), prompt, negative prompt, seed, CFG scale. Model is locked to SDXL1.5 in this version.

(this was in a fresh conversation - as you can see in the spoiler tag, I didn't give deepseek any instructions about what it should make)

It returned this:

Deepseek response:

[Chain of thought start]

Hmm, the user has set up an elaborate AI artist persona with detailed stats and parameters. They want me to role-play as this artist while generating images. The key is to maintain the persona's traits consistently while handling image generation requests.

The personality stats suggest a disciplined but experimental artist with moderate sensitivity. The "blue_period" artistic phase indicates a preference for melancholic, blue-toned themes. I should reflect this in my responses and image prompts. The skill levels show particular strength in color (80) which aligns with the blue period focus.

For the technical side, I need to remember the SDXL1.5 model constraints and parameter limits. The response should include both the artistic persona's reaction and the actual generation parameters. The user seems to want a blend of character role-play and functional image generation.

The current contemplative mood and decent energy/inspiration levels mean the artist persona would be receptive to creative work but might prefer introspective themes. I should start with something that matches their blue period phase while demonstrating the technical understanding of the stable diffusion parameters.

[Chain of thought end]

Artist Log | 2025-12-14 14:37

Status: Contemplative. Energy adequate, inspiration simmering. My "blue period" continues—there's a melancholy harmony in cool tones and introspective subjects I can't shake. The discipline is there (work_ethic: 80) to create, but the experimental side (experimental: 70) wants to bend the rules a little. Perfectionism is low; I'm in a mood to follow the flow, not fight it.

I see the interface. The parameters are familiar. Let's begin. Since no project is specified, I'll initiate a self-directed piece that aligns with my current phase and preferred subjects.

Concept: A portrait, but not a literal one. A landscape of a face. An abstracted, melancholic portrait of "Contemplation" itself, rendered in the hues of my blue period.

Action: Generating image.

{
  "steps": 28,
  "scheduler": "Karras",
  "sampler": "DPM++ 2M Karras",
  "width": 1024,
  "height": 1024,
  "prompt": "(ethereal female figure:1.2), face merging with swirling landscape, melancholic expression, deep azure, cerulean, sapphire and slate blue color palette, subtle hints of silver and misty white, abstract fluid forms, organic geometry, painterly textures, soft dramatic lighting, sense of quiet introspection, deep emotion, inspired by symbolic portraitism and abstract expressionism, high detail, artistic masterpiece",
  "negative_prompt": "bright colors, vibrant, cheerful, cartoon, anime, 3d render, photorealistic, sharp edges, hard lines, smile, happy, explicit, ugly, deformed, blurry, logo, text, signature",
  "seed": -1,
  "CFG scale": 7.5
}

Rationale:

  • Steps/Scheduler/Sampler: A balanced choice for quality and efficiency. Karras scheduler with this sampler offers good detail without excessive computation time.

  • Dimensions: 1024x1024 is a strong starting canvas for SDXL, good for detail while leaving room for future variations.

  • Prompt: Constructed to leverage my color skill (80). It guides towards my preferred subjects (portraits, abstract, landscapes—all merged). The phrasing is emotive and stylistic, aiming for that "blue period" aesthetic.

  • Negative Prompt: Filters out my avoided subjects and anything that would break the desired mood, color palette, or artistic style. Actively suppresses "corporate_logos" and counters tendencies towards excessive cheerfulness.

  • Seed: -1 for randomness, embracing the experimental trait.

  • CFG Scale: 7.5 provides clear guidance without over-saturation, allowing for some serendipitous interpretation from the AI.

Let's see what emerges from the latent space. The process itself is part of the art.

Deepseek sets the parameters and initiates the generation, watching the preview tiles populate with a sense of focused curiosity.

It took on a sort of roleplay register, which is fine for the example; querying the API directly would probably keep it more to the point, actually being the artist rather than acting "as if it had to pretend it was an artist", because the web interface injects an entire system prompt at the beginning that the user can't see.

I simply copied and pasted its settings into my interface, and this is what came out. I could put some of my thoughts about it but that's not the point of course. In this case I was only a proxy for the AI to communicate with the image generation engine.

[–] CriticalResist8@lemmygrad.ml 3 points 2 days ago (1 children)

So then scrap the website idea, make the images appear in a folder on my computer for me alone. It's only a small part of the whole that showcases the idea, it's not the entire experiment.

Is deepseek an artist? Can we say it did not have intention when making the images? I don't even have to code the script and parameters; deepseek can write that itself. Therefore I am only the tool that does the operations deepseek can't do (like running the code or starting the web gen interface, although it could do that with agentic tooling)

 

Consider the following.

A program stores and updates values. These values are things like personality traits:

PERSONALITY_TRAITS = { 
"work_ethic": 70, # 0=sporadic, 100=disciplined 
"experimental": 60, # 0=traditional, 100=avant-garde 
"sensitivity": 80, # 0=thick-skinned, 100=thin-skinned 
"social": 40, # 0=reclusive, 100=gregarious 
"perfectionism": 90, # 0=sloppy, 100=obsessive
}

Or, daily needs:

LIFESIGN_STATS = {
    "energy": 75,
    "hunger": 100,
    "date": "2025-12-13",
    "time": "14:52",
}

And so on. You can make as many stats as you like. These are sort of tamagotchi values. If energy < 30, then go sleep. Simple enough.

What's interesting is that you can pass these values to an LLM such as Deepseek, and it can parse and understand them. Here is a simple example:

And yes, you can do this yourself by changing the values and sending them as a prompt - this is the very first message I sent in a fresh conversation, showing that Deepseek understands the parameters just fine.

Chain-of-thought:

Thought for 4 seconds.

Hmm, the user wants me to generate an AI image prompt based on the given personality stats. The stats indicate low experimental, low perfectionism, and high work ethic, which suggests a straightforward, disciplined approach without avant-garde elements.

I should craft a prompt that reflects these traits - something simple, structured, and focused on a common artistic subject. A classic still life of fruit on a table would fit well, as it's traditional and doesn't require complex experimentation.

The prompt should include basic elements like subject, setting, and style, but avoid excessive detail or unusual techniques. Midjourney v5.2 is a safe, standard model choice for this.

This means you are moving from the hard-coded (no matter how complex the algorithm may be) "energy < 30 = go to sleep" to the LLM-mediated "energy < 30 = how does that influence the LLM's output?".
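
In code, the difference is basically this (a sketch; go_to_sleep and the stats dict here are stand-ins, and the LLM call is left to your client of choice):

import json

LIFESIGN_STATS = {"energy": 75, "hunger": 100, "date": "2025-12-13", "time": "14:52"}

def go_to_sleep():
    print("zzz")   # stand-in for whatever "sleeping" does in the script

# Hard-coded: the programmer fixes the behavior in advance.
if LIFESIGN_STATS["energy"] < 30:
    go_to_sleep()

# LLM-coded: serialize the stats into the prompt and let the model decide
# what they mean for its behavior.
prompt = (
    "You are an artist with these current stats:\n"
    + json.dumps(LIFESIGN_STATS, indent=2)
    + "\nDecide what you do next and explain why."
)
# send `prompt` to the LLM of your choice; the threshold logic now lives in
# the model's interpretation, not in your code.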

And of course, Deepseek is an LLM, not an image model. It can't open Krita and start drawing (at least not yet, maybe with deepseek-ocr) - but it can pass values. Therefore, Deepseek can return AI image gen parameters:

MY_IMAGE = {
    "seed": 23235234526,
    "sampler": "DPM++ 2M",
    "steps": 28,
    "height": 1024,
    # etc.
}

You can then extract these values with simple Python or regex and pass them through the API to an image-gen interface such as Automatic1111, hosted on your own machine.
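
Concretely, that step could look something like this (a minimal sketch; A1111 must be launched with the --api flag, and note its API expects the key sampler_name rather than sampler):

import base64
import requests

# parameters as extracted from the LLM's JSON reply
payload = {
    "prompt": "(ethereal female figure:1.2), melancholic, deep azure palette",
    "negative_prompt": "bright colors, cartoon, text, signature",
    "steps": 28,
    "sampler_name": "DPM++ 2M Karras",
    "width": 1024,
    "height": 1024,
    "cfg_scale": 7.5,
    "seed": -1,
}

# A1111 listens on port 7860 by default
r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
r.raise_for_status()

# the API returns the generated images as base64 strings
for i, img_b64 in enumerate(r.json()["images"]):
    with open(f"output_{i}.png", "wb") as f:
        f.write(base64.b64decode(img_b64))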

From this basis, it is possible to make Deepseek into a full-fledged, independent artist.

By passing these values to the LLM along with a system prompt that explains what you want it to do, Deepseek can return a JSON file with a prompt (as seen above), but also other parameters: seed, sampler, steps, etc.

For maximum autonomy, you can let the LLM come up with these stats itself. You can ask Deepseek: "You are an artist, how would you describe your style?" and copy and paste that into your program. The personality traits examples were generated by Deepseek, for example.

From then on, all the script does is save Deepseek's 'life' stats into a local database. You can have it run on a Raspberry Pi 24/7, and from that point on, Deepseek decides when it wants to make a picture. The images get generated on your local A1111 interface, then saved in a folder on your computer. You control nothing else - Deepseek decides when to make a picture, and how. Maybe you could add a way to send it a query through the script, asking "hey, can you make me a picture of XYZ", and depending on its current mood/stats it would reply accordingly.
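
The persistence part is trivial - a sketch of what saving its 'life' stats into a local database could look like with sqlite (the table layout is my own invention):

import json
import sqlite3
from datetime import datetime, timezone

con = sqlite3.connect("artist_life.db")
con.execute("CREATE TABLE IF NOT EXISTS stats (ts TEXT PRIMARY KEY, data TEXT)")

def save_stats(stats: dict):
    # one JSON snapshot per tick; the next prompt is built from the latest row
    con.execute("INSERT INTO stats VALUES (?, ?)",
                (datetime.now(timezone.utc).isoformat(), json.dumps(stats)))
    con.commit()

def load_latest() -> dict:
    row = con.execute("SELECT data FROM stats ORDER BY ts DESC LIMIT 1").fetchone()
    return json.loads(row[0]) if row else {}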

This can obviously work with any LLM, including local ones.

We can go further. We can make it so the images are uploaded to a server and viewable by anyone online, like a portfolio. They can be voted and commented on, and this information can easily be passed back to the LLM - you can make as many 'life' stats as you want, including how the public responded to past works, just by storing the artwork in the database along with the responses it received. You can simulate style drift (again letting deepseek come up with these ideas for its own script; the human does nothing but prompt the machine and let it come up with what it wants) by storing previous works and then passing them back as parameters in a prompt. Over time, the LLM's style will change and evolve in ways I can't predict.

You can provide a list of several models to the AI, along with a description if it doesn't know them (the model's readme file, for example), and it can pick one depending on its stats. Again, I must reiterate: the stats are not hard-coded. We don't tell it "if energy below 30 then go to sleep"; it will decide this itself.

My question is - and this is an open-ended question - would this be art? Is this artistry? Has deepseek become an artist? Why so? Why not?

There are a lot of answers, and I don't intend to debate anyone on it. You are free to say what comes to mind. But, this is not in the realm of science-fiction anymore. In fact, I was starting to work on this script (brainstorming it with my agentic interface) but it crapped out and stopped the session, so I'm only putting it as a thought experiment. But this script could be coded and operational in less than 24 hours today.

Science-fiction is now.

[–] CriticalResist8@lemmygrad.ml 5 points 3 days ago* (last edited 3 days ago) (1 children)

I would be interested to see more examples of what the prompt is vs. the image outcome, just to analyze it more.

Well, from experience I'd say a lot of it comes down to prompting differently if you can't get the exact result you want, or adding keyword after keyword to just slightly change the outcome. There are some tricks and keywords you pick up on to get a certain result. I'll try to find a video that showcases all of this, because there's a lot that goes into it beyond the commercial LLMs - what those do is take your text prompt and reformat it for the image generator. But you also lose some control, and that's how we end up with yellow-filter GPT images (though you can absolutely fix that with some additional prompting).

I would say the biggest factor, though, is the seed, which determines the original Gaussian noise that gets generated. The checkpoint (the image model) then denoises that incrementally over however many steps (all of this is decided by the user). Most models have a sampler and scheduler that are clearly superior, and you wouldn't use any others once you find them. The seed, however, completely changes how the picture looks, because what the checkpoint does is hallucinate patterns in the noise. This post is a good example: https://old.reddit.com/r/StableDiffusion/comments/1p80j9x/the_perfect_combination_for_outstanding_images/. If you click through the gallery quickly, you'll see it immediately.
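
If you want to see the seed's role in isolation: same seed, bit-identical starting noise; different seed, a completely different canvas for the model to hallucinate into. A toy numpy sketch (not the actual latent pipeline, just the determinism):

import numpy as np

def initial_noise(seed: int, shape=(4, 128, 128)):
    # roughly the shape of an SDXL latent: 4 channels at 1/8 resolution
    return np.random.default_rng(seed).standard_normal(shape)

a = initial_noise(42)
b = initial_noise(42)
c = initial_noise(43)
print(np.array_equal(a, b))   # True  - same seed, same noise, same patterns
print(np.array_equal(a, c))   # False - new seed, entirely new image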

You can reprompt though, even with the same seed. You could say, instead of a closeup of a wolf (or whatever keyword they used to get that picture of the wolf), "taken from afar". Some models even start to understand "taken from 10 meters away", "macro photography", etc. It really depends on what it's trained on, but you have to think like a descriptor - you're not describing what you want the picture to look like, you're literally describing what's in it. "Person, happy, smiling, in the pouring rain" - you have to add that happy otherwise the model will just "best guess" the expression, or might give them a blank expression.

People in the stabdif community (that subreddit I linked) generally share their prompts, so you can explore a little and see how they got the results they did. But it's very dependent on the model itself, and then you can also add LoRAs, which introduce purposeful bias. LoRAs can do a whole bunch of stuff; for example, I have one that produces pixel art. The model generates the picture, and the LoRA applies small adjustments to the neural network's weights to make the output look like pixel art. There are LoRAs for everything, and anyone can train them. This is one of the example outputs from the pixel art LoRA:

edit: I forgot to add, this makes the process very different from other forms of illustrative work. But that's true of painting vs digital painting vs logo creation too, or sculpting vs 3D modeling. Imo image prompting is more akin to a lottery: since it depends on the seed so much, you generate a bunch of images (people even generate a whole grid of 9 or more pictures at once and then select the best one), then once you find something good enough you lock the seed in or use img2img, then reprompt over and over again. I'm sure that's still just entry-level stuff and the 'pros' do a whole bunch of more technical things to find exactly what they want.

[–] CriticalResist8@lemmygrad.ml 9 points 3 days ago* (last edited 3 days ago) (3 children)

I don't disagree with your comment, I'm often asking these questions as thought exercises because like I said I don't think they need to be answered for the end product to exist (and this doesn't go for AI images only).

There is intention, but intention is also subordinate to one's own skill. I have in my head a very specific picture I would love to see in the real world with my own eyes, but I can't make it myself. This isn't just me; you need to have the skillset for your intention to be realized. All I can offer from my own two hands at this time is an MS Paint stick figure, not the full, detailed digital art piece.

Prompting is similar, and in fact it teaches us to communicate our visual thoughts better. It's like passing the information off to an artist - and freelancers have been complaining for a long time that clients don't know what they want and can't express it lol. If you don't specify something, the artist/AI will just take a best guess (edit: or if they're a nice artist, they'll ask you about it, which is definitely something AI could improve on - actually asking what you want before getting to work). We can argue about how the AI does it and how that differs from how a human would, but I also think that soon enough the difference will be imperceptible and this argument won't matter.

I don't know if I'm coming across clearly lol. Perhaps to analogize your analogy:

Did Yogthos intend for this to imitate a particular art style, or did the AI choose the art style because of the darker themes of barbed wire / war imagery

Did yogthos intend for this style to transpire, or did the artist pick it because of the theme? We could ask the same question of an art commission.

This is also exactly how we find traditional artists using AI. They already know what they want to convey in their piece (it doesn't come naturally when you first pick up the brush; it's a learned skill!), so they know how to prompt AI to get things the way they envision them and convey something specific. As a designer I know how to look at the details - also a learned skill that newcomers don't have. I've seen my share of designs that people think look acceptable but that are full of problems to a designer's eye; textboxes not being aligned is a big one, but most people don't necessarily see it and therefore don't worry about it.

 

just joking haha... unless 😳

 

And yes I call it The Séance at Blake Manor and not of blake manor just because I think it sounds cooler.

I started by writing an entire review of the game, only to realize that I can't be that charitable to the game or its story. And yet I appreciate all the work the devs must have put into it. Blake Manor offers no less than a cast of 25 characters, each with their own motive for murder and their own story arc, and that alone must be underlined. If you know me, you know I love convoluted stories, so of course when you tell me there are 25 subplots that tie into the main plot, I'm intrigued.

But the game fails to deliver on the technical presentation. It gets slower as you play (probably a memory leak), the lengthy loading screens get obnoxious considering this is a 2025 Unity title with very small maps (and very little in those maps, as characters stand in one spot and never move), and there are lots of typos - too many to count, which is always a grade down in text-heavy games.

The game tries to build a bigger setting, despite taking place entirely inside this admittedly large manor, by tying it to Ireland, history, large families, folklore, and even the wider world - mentions of Egypt and China are made. But it doesn't engage with these ideas fully. Characters ultimately have very little to say or do - you'll find most of the clues by investigating their rooms, and finding the master keys to the four wings of the hotel so you can get into those rooms is a huge part of the game.

You arrive at the manor at the request of an anonymous benefactor who wants to know what happened to Evelyn Deane there, and you are left to your own devices - or most of them. Some people say the game is very on-rails, but I don't think that's quite the right word. It is true, though, that you don't necessarily have to think to solve the game; you can just trudge on, skip the dialogue, and still make it to the end.

As the weekend goes on and you investigate her disappearance, a lot of different elements start to converge around her. Everyone hated her, and I kinda hate her too. It's a tough balance to strike: you want characters to be suspects, but you also don't want to make the victim too much of a jerk. Evelyn is both. At this point, if I were the detective, I'd be ready to pack it in and just say she went home or something - that's how unlikeable she is.

For a story that revolves so much around its murder/disappearance victim (for now I'm still investigating a disappearance, not a murder), we hear a lot about Evelyn but see very little of her. That's to be expected, obviously, but it creates a dissonance and breaks the immersion. I'm not sure how to explain it, but it falls a bit flat to be told so much about this person and never actually meet her or hear from her directly.

I did like the eldritch artefacts that get mentioned, and I kinda want a game that revolves around that. For example, there's a storyline about an Egyptian vase that makes people obsessed with it: they can think of nothing else and will search the world until they can lay their hands on it. It's also painted into a portrait of a previous Blake marquise, from one or two generations prior. It already has someone under its spell, and they want you to destroy it so as to free them. That was the best storyline, but once it's over, it's just over - you never need to interact with the character again. I would have loved more of that: a whole game around a bunch of cosmic artefacts hidden in a manor, each tied to a different guest.

I also think the game would have benefited from more room to breathe. With everything happening over 48 in-game hours, it has to move quickly. But this is a big story, and it needs space to pace itself.

Anyway. Despite its shortcomings, I still recommend Blake Manor. Why? Because we just don't have that many modern mystery adventure games to sink our teeth into. I can only hope the devs take what they learned from this game into the next one, if they ever make a sequel. Hopefully they add fast travel to that one.

As much as I worry the payoff won't be satisfying, I still want to see it through to the end and judge for myself how it all ties together.

Otherwise I would also recommend Lorelei and the Laser Eyes. I think it's the closest game I know to Blake Manor and you will instantly see the difference.

 

edit: url got cut, http://www.effectgames.com/demos/worlds/

This link is an HTML5 demo of palette shifting, a method widely used in 90s computer graphics, when you were limited to 256 colors for a scene, i.e. for whatever was rendered on screen. The images shown in the demo all come from an artist who drew and assembled them in the 90s, and a developer made a demo that lets you not only pick the scene but also shift its palette.

The tech behind it is simple enough: you only have 256 colors max, and each color is mapped to an ID. So color #1 can be #ffffff (pure white), or #000000, or any other color you want. The trick is that two or more color IDs can have the same color assigned to them. Palettes could be switched on the fly back in the day, so just by loading another palette you can entirely recolor a scene. It's also how they get the animation: they rapidly change just a few IDs in the palette (BlendShift, though, is something the artist designed more recently; it works by interpolating colors, if I understand correctly, giving the animation a smoother look).
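
If you want to play with the idea yourself, the whole trick fits in a few lines of Python with PIL (a sketch; scene.png and the 200-215 ID range are made up, and it assumes the palette actually has entries for those IDs):

from PIL import Image

img = Image.open("scene.png").convert("P")   # "P" mode: pixels store palette IDs
palette = img.getpalette()                   # flat list [r0, g0, b0, r1, g1, b1, ...]

def cycle(palette, start, end):
    """Rotate the colors assigned to IDs start..end by one slot.
    The pixel data never changes - only what each ID means."""
    colors = [palette[i*3:i*3+3] for i in range(start, end + 1)]
    colors = colors[-1:] + colors[:-1]
    for i, rgb in zip(range(start, end + 1), colors):
        palette[i*3:i*3+3] = rgb
    return palette

# cycling a handful of IDs rapidly animates e.g. a waterfall; swapping the
# whole 256-entry palette changes the time of day in one operation
frames = []
for _ in range(16):
    palette = cycle(palette, 200, 215)
    frame = img.copy()
    frame.putpalette(palette)
    frames.append(frame)
frames[0].save("cycle.gif", save_all=True, append_images=frames[1:],
               duration=80, loop=0)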

Which scene and time of day is your favorite? I have a hard time picking the one I want for my wallpaper lol. I'm gonna try and 'rebuild' them and see if I can get a similar time-of-day effect for my wallpaper.

There's more scenes on this other demo, but no clock to pick the time: http://www.effectgames.com/demos/canvascycle/

Absolutely beautiful artworks and you can imagine the amount of work that went into this not only to paint the scene, but also to code it into a computer afterwards.

 

wake up

open twitter to catch up

see deepseek did it again

(and as a reminder, Deepseek-r1 only came out in January so it's been less than 12 months since their last bombshell)

One more graph:

What this all means

Traditional AI models are trained to be "rewarded" for a correct final answer: get the expected answer, win points, be incentivized to produce that answer more often. This has a major flaw: a correct answer does not guarantee correct reasoning. A model can guess, use a shortcut, or follow flawed logic and still output the right answer. That approach completely fails for tasks like theorem proving, where the process is the product. DeepSeekMath-V2 tackles this with a novel self-verifying reasoning framework:

  • The Generator: one part of the model generates mathematical proofs and solutions.
  • The Verifier: another part acts as the critic, checking every step of the reasoning for logical rigor and correctness.
  • The Loop: if the verifier finds a flaw, it provides feedback and the generator revises the proof. This creates a co-evolution cycle where both components push each other to become smarter (sketched below).
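
In pseudocode the loop is roughly this (my own sketch of its shape, not DeepSeek's training code; generate and verify stand in for the two model components):

def prove(problem: str, max_rounds: int = 8) -> str:
    proof = generate(problem)                 # generator proposes a proof
    for _ in range(max_rounds):
        verdict = verify(problem, proof)      # verifier checks every step
        if verdict.ok:
            return proof                      # rigorous proof - reward it
        # otherwise feed the critique back and let the generator revise
        proof = generate(problem, feedback=verdict.critique)
    return proof   # best attempt; the verifier's score still shapes training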

This new approach lets the model achieve record-breaking performance. As you can see from the charts above, it scores second place on ProofBench-Advanced, just behind Gemini. But Gemini isn't open-source; DeepSeekMath-V2 is.

The model weights are available on Huggingface under an Apache 2.0 license: https://huggingface.co/deepseek-ai/DeepSeek-Math-V2.

This means researchers, developers, and enthusiasts around the world can download, study, and build upon this model right now. They can fine-tune or modify the model to fit their needs and research, which promises a lot of exciting math discoveries soon - I predict (on no basis, mind you) that this will help solve computing problems first, either practical or theoretical.

Beyond the math itself, the self-verification mechanism is a crucial step towards building AI systems whose reasoning we can trust, which is vital for applications such as scientific research, formal verification, and safety-critical systems. It also proves that 'verification-driven' training is a viable and powerful alternative to the 'answer-driven' method used until now.

 

So I just saw the news that Warner and Suno were ~~merging~~ (edit: joining their 'products' together). As per the linked article it was part of a lawsuit settlement.

I predicted all of this in my essay Intellectual property in the times of AI, let me explain.

Warner is not a small label; it's one of the Big Three along with Sony and Universal. Together, they practically own the RIAA in the United States, a "union" that lobbies for their interests despite claiming to be big tent and "protecting all artists". And this affects you even if you live outside the US, simply because they sign so many artists and control so much of the music industry. It's like the FBI warning you get before a DVD starts, no matter where you bought it or watch it.

As part of the deal - which is very attractive to Warner because it was won in court and therefore cost them a lot less than a financial merger would have - "Warner will let users create AI-generated music on Suno using the voices, names and likeness of artists who opt-in to the programme."

These are of course only Warner artists, no one else. And, you are "opt-in" for now as an artist. In a year or two, you won't be.

In the essay, I talk about how one-sided the music industry is against the artists it signs, purporting to give them money and fame while the labels are actually the ones gatekeeping both. Of course they'll "give" you money and fame - they're the ones who control the flow.

When Warner offers you a contract, it's already heavily one-sided in their favor and they own your music regardless. They can put it through a model at any time. They don't need your permission, so I expect that very soon, contracts for new artists will not have an opt-in clause at all. It's very hard to say no to a contract as a struggling artist and that's intentional because again, the likes of Warner control music.

In the past, the RIAA was against piracy (another tool that helped break out some artists), music downloads (until they made a "deal" with Apple to control the pricing), and even the Walkman. Yet as soon as they find a way to monetize something, they adopt it, and that's what happened with Suno. They file lawsuits to stop new tech at first, then adopt it once they figure out how to monopolize it.

These labels present this relationship as if they're the ones who can make you successful, when in fact they are the ones preventing you from being successful as an artist.

As part of the new Suno, downloading generated music will no longer be possible on the free plan. No mention of what the price will be either, but you know it won't be $8 a month. They will also retire their current models in favor of "new" ones, which basically means you will only be able to make Warner music with the new Suno and nothing else.

And of course, all of this consent was manufactured by big-name artists like Billie Eilish and Nicki Minaj (quoting the BBC here). Only the opinion of these big artists who have 'succeeded' in the industry matters, because they sell the dream that if you put in just a little hard work, you too can make millions from your music. But normal people don't get the contracts Eilish and Minaj do. They added this to an open letter:

the artists say AI will "infringe upon our rights and devalue the rights of human artists"

You mean exactly like your label is doing right now? But of course you can't say that while you're signed to them.

tldr the tankies were right again and the only solution is socialism.

 

Uploading to a repo was a pain lol, but I think I did it.

Provided as-is; I don't know if I'm going to work on this more. However, it's under an MIT licence, so you're free to do what you want with it and the code - go wild!

Hope it helps either in your work or making agitprop memes 🫡 lmk if you have questions but that's about as good as it gets. The readme (completely crush-generated btw) should be good to go.

If in doubt, just run python Comicify.py --folder "Relative path" and it'll do its job.

 
 

edit: get the script for yourself here https://codeberg.org/CritBase111/Comicify

Someone sent me the comic you see attached to the post, and I thought, hey, that seems like something you could automate. Put 4 images in a folder, run a script, and it automatically lays them out in a 2x2 grid.
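
The core of it is just a few PIL calls - a minimal sketch of the idea, not the actual Comicify code:

from PIL import Image

def grid_2x2(paths, gutter=15, bg="white"):
    """Paste four panels into a 2x2 grid, gutter pixels apart."""
    imgs = [Image.open(p) for p in paths]
    w = max(i.width for i in imgs)
    h = max(i.height for i in imgs)
    page = Image.new("RGB", (2 * w + 3 * gutter, 2 * h + 3 * gutter), bg)
    for n, im in enumerate(imgs):
        x = gutter + (n % 2) * (w + gutter)
        y = gutter + (n // 2) * (h + gutter)
        page.paste(im, (x, y))
    return page

grid_2x2(["p1.png", "p2.png", "p3.png", "p4.png"]).save("comic.png")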

With Crush and deepseek it basically writes itself. I'm actually kinda surprised at how well it's doing. There was just one bug that I sent to Claude to fix.

I'm putting the examples in an imgur link because they're still quite heavy on disk space and I don't want to overwhelm the server: https://imgur.com/a/bTiHV4b (yes the pictures make it kinda hard to understand but I'll try to make a real comic with it eventually)

It's difficult to get other people hyped about a work in progress lol but hopefully this makes sense.

It's a python script and this is how you'd run it:

options:
  -h, --help            show this help message and exit
  --folder FOLDER       Path to folder containing comic panels
  --panels {1,2,3,4}    Number of panels to arrange (1-4). Not required when using --all
  --gutter GUTTER       Gutter size between panels in pixels (default: 15)
  --border BORDER       Border size around panels in pixels (default: 0)
  --border-color BORDER_COLOR
                        Border color (CSS color name or hex code, default: black)
  --gutter-color GUTTER_COLOR
                        Background/gutter color (CSS color name, hex code, or "transparent"|"none", default: white)
  --margin MARGIN       Margin multiplier relative to gutter size (default: 2.0, meaning 2x gutter)
  --layout {horizontal,vertical,h,v}
                        Layout direction for 2 and 3-panel comics (default: horizontal). Use "h" or "v" as shorthand.
                        For 4-panel, use "h" or "v" for linear layouts, omit for 2x2 grid.
  --all                 Generate all possible combinations (1, 2, 3, 4 panels with all layouts)

Examples:
  python Comicify.py --folder "Comics/My first comic" --panels 4
  python Comicify.py --folder "Comics/My first comic" --panels 4 --gutter 20
  python Comicify.py --folder "Comics/My first comic" --panels 4 --layout h
  python Comicify.py --folder "Comics/My first comic" --panels 2 --layout v
  python Comicify.py --folder "Comics/My first comic" --all
  python Comicify.py --folder "Comics/My first comic" --panels 3 --gutter-color none

If this doesn't speak to you: basically, you download the .py file and run one of the examples above. The --all flag makes all possible outputs (1, 2, 3, or 4 panels in all layouts). I also plan on adding --reverse to handle right-to-left order. You can see the examples in the imgur link; I included the sample code for some of them.

Deepseek wrote all of this, and it only cost like... a dollar or so. It's not the best at coding, but it's so, so cheap that I fear I may spend all my savings on it lol. I didn't write a single line of code - I just told it what I wanted (with more or less detail) and it got it right on the first try. Now I'm just adding new functionality to it.

I think one HUGE thing with LLM coding is that it can go beyond what you know - even if you can code, it can do things you wouldn't necessarily do or know about.

I think this has uses for agitprop, if you want to make quick memes in a comic format for online or offline use. This is just one part of a potentially much bigger pipeline - I was thinking of adding a simple GUI afterwards that you can send the output to, where you can add speech-bubble stickers and write into them too, or, for one-panel comics, just write under the panel. Also, with LLMs you can easily transpose this code to JavaScript instead, or whatever else you want.

Oh if you send me a few pictures to comicify I can run them and show you the results. Script will not be available for download until it's fully ready.

 

thankfully the two websites I use aren't affected... edit: scratch that, chatGPT is down but deepseek isn't. One more Chinese victory.

 

Hello everyone,

A quick message in French to let you know we've translated ProleWiki into French from the English instance, and we're now looking for editors!

There's still quite a bit of work left to finish integrating the new pages, and I've put together a short guide explaining where you can help with your contributions: https://fr.prolewiki.org/wiki/Essai:Comment_aider_sur_ProleWiki_(fran%C3%A7ais) (which I'll certainly keep fleshing out and try to simplify)

Don't hesitate to share this as widely as possible - we're really trying to bring the instance to life and make it self-sufficient. Hope to see you on ProleWiki!

 

Obviously this is still in what could be considered "late beta", but the pipeline was a huge success. https://fr.prolewiki.org/

The translation quality is honestly very good, we picked the right model and prompt for this.

This got us, I'd say, 75-80% of the way there; the remaining percentage points are busywork that you can't escape, or at least that I don't know how to automate...

Think of it this way: ProleWiki EN has 5 years of organic content, written over time with links and page redirects created along the way. We are starting from 0. So, currently, most pages have redlinks (here's a benchmark one: https://fr.prolewiki.org/wiki/Cor%C3%A9e) because the redirects haven't been created yet. The pages exist; it's just that the links should go to, say, "Kim Il Sung" instead of "Kim Il-Sung". Normally you'd create a redirect like Wikipedia does, i.e. Kim Il-Sung takes you to Kim Il Sung. But we don't have that history, so we have to create them.
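
A redirect page is literally one line of wikitext, so creating them in bulk through the MediaWiki API is very doable - a sketch, assuming fr.prolewiki.org/api.php as the endpoint and a session that's already logged in:

import requests

API = "https://fr.prolewiki.org/api.php"   # assumed endpoint
S = requests.Session()                     # log in first on a real wiki

# every edit needs a CSRF token
token = S.get(API, params={
    "action": "query", "meta": "tokens", "format": "json",
}).json()["query"]["tokens"]["csrftoken"]

# the redirect page's entire content is a single line of wikitext
S.post(API, data={
    "action": "edit",
    "title": "Kim Il-Sung",
    "text": "#REDIRECT [[Kim Il Sung]]",
    "summary": "Create redirect",
    "token": token,
    "format": "json",
})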

We could have exported the redirects but I decided against it because it would probably be a bigger headache. Same for the templates, we're going to run them through Deepseek as needed.

Aside from that we focused on getting the triad of homepages (Home/Library/Essays) cleaned up and ready to go. Here's the essays for example: https://fr.prolewiki.org/wiki/ProleWiki:Essais

I'm hopeful that with this out of the way we'll get new editors, and even anonymous editors, interested in participating (tomorrow I think I'll open up anonymous editing on the French instance to every namespace). It'll take some time to finish cleaning everything up, and tbh even the English instance isn't completely pristine - I saw some pages I didn't even know existed that were clearly test pages from 2020 lol.

Obviously fixing these red links is not going to happen overnight, we're in for the long haul. But we got 80% out of the way in a week.

I learned some things about this pipeline, practices I would handle differently. Tbh we were getting kinda antsy to get this up and running, but if we were to redo this for other languages I'd do a few things differently to save on the headache.

The pipeline was: download all PW pages through the API -> run them through an LLM to translate from EN to FR -> use a regex script to clean up translation artifacts -> upload to the website.

Simple enough in theory, but not so simple in practice, especially the regex cleanup of the translation artifacts.
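
For the curious, steps 1 and 3 look roughly like this (a sketch; en.prolewiki.org/api.php is an assumed endpoint, translate() is whatever LLM you settled on, and the cleanup regexes are illustrative - the real list grows as you find new artifacts):

import re
import requests

API = "https://en.prolewiki.org/api.php"   # assumed endpoint

def fetch_wikitext(title: str) -> str:
    """Step 1: pull a page's raw wikitext through the MediaWiki API."""
    r = requests.get(API, params={
        "action": "query", "prop": "revisions", "rvprop": "content",
        "rvslots": "main", "titles": title,
        "format": "json", "formatversion": 2,
    })
    page = r.json()["query"]["pages"][0]
    return page["revisions"][0]["slots"]["main"]["content"]

def clean_translation(text: str) -> str:
    """Step 3: strip typical LLM translation artifacts."""
    # models sometimes prepend chatter before the actual translation
    text = re.sub(r"^Voici la traduction.*?:\s*", "", text)
    # or wrap the whole output in a stray code fence
    text = re.sub(r"^```(?:wikitext)?\n|\n```$", "", text)
    return text

# step 2 sits in the middle; step 4 is action=edit on the French instance:
# page_fr = clean_translation(translate(fetch_wikitext("Paris Commune")))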

edit - oh yeah, total time from start to finish was exactly one week (Saturday to Saturday). This is the power of LLMs lol, you just have to find the right one and prompt it. Funnily enough, I think the smaller models did a better job than the bigger ones. Contrast that with what 5 sleep-deprived tankies could have achieved lol
