🤖 AI News Summary
2026-05-04 22:11 GMT+8 · summary_2026-05-04_22-11.md

🤖 AI News Summary - 2026-05-04 22:11 GMT+8

Focused AI/dev subreddit roundup.

Full site: https://kkklobsterfarming.github.io/ai-news-summary-site/

What changed since last run


r/openai

  • No non-pinned/newsworthy posts fetched after filtering.

r/LocalLLaMA

  1. A Qwen finetune, that feels VERY human

    • Hello guys, So TL;DR, I was asked by multiple people to make an Assistant_Pepe_32B version, but the best base model contender was Qwen3-32B, a model that is very hard to tune on anything other than STEM. The concept of Assistant_Pepe is an…
    • Timestamp: 2026-05-04 01:20 GMT+8
    • Community: Community reaction (heuristic-fallback): The comment section is split between joking and positive. Top reactions focus on Why qwen 3 and not 3.6? Also, make the ggufs so that people can test them to see if they’re actually any better than the base model | Try to break the IBM newest model. Fuck it why not bro…..
    • Author: /u/Sicarius_The_First
  2. Pushing a 5-Year-Old 6GB VRAM laptop to Its Limits: Qwen3.6-35B-A3B

    • For the past few weeks, I have been trying to get this model working on my hardware. It still feels incredible how much better open models have become.
    • Timestamp: 2026-05-04 06:16 GMT+8
    • Community: Community reaction (heuristic-fallback): The comment section is mostly positive. Top reactions focus on Very nice work. My main dev machine is an old Dell XPS w/ a 2060, 64GB of RAM - basically just a front end for the inference server running… | I’m running models okay on 32gigs ddr4 3200 and a 6600xt(8gigs)….
    • Author: /u/abhinand05
  3. How much will it cost to host something like qwen3.6 35b a3b in a cloud?

    • I keep hearing the model is good, I don’t have the hardware for it, and I will wait to the end of the year for the hardware to evolve. But, I still need coding, people are saying qwen3.6 35b a3b is good, so the question is now how much…
    • Timestamp: 2026-05-04 07:47 GMT+8
    • Community: Community reaction (heuristic-fallback): The comment section is split between concerned and joking. Top reactions focus on Don’t lol. If you’re going to use a model in the cloud you might as well use the subsidized models that are extremely cheap like Minimax,… | This this this deepseek v4 flash is basically free at…
    • Author: /u/Euphoric_North_745
  4. Mistral-Medium-3.5-128B-Q3_K_M on 3x3090 (72GB VRAM)

    • [Image: Mistral-Medium-3.5-128B-Q3_K_M on 3x3090 (72GB VRAM)] Here is the actual speed of Mistral Medium Q3 running locally on 3x3090 first some Python…
    • Timestamp: 2026-05-04 08:46 GMT+8
    • Community: Community reaction (heuristic-fallback): The comment section is mostly positive. Top reactions focus on Pelican is not great. But I believe svg benchmarks are overfitted anyways. | Yes, I wanted to check quality of Q3 quant. At least all three files are valid code. Overall sentiment - post: positive; author: mixed….
    • Author: /u/jacek2023
  5. What a time to be alive from 1tk/sec to 20-100tk/sec for huge models

r/llmdevs

  1. Added Ollama / LM Studio / llama.cpp support to my dataset generator app - fine-tune your model fully offline (or mix local + cloud)

    • [Image: Added Ollama / LM Studio / llama.cpp support to my dataset generator app - fine-tune your model fully offline (or mix local + cloud)] A while back I shipped a desktop app that generates fine-tuning datasets via OpenRouter. Got my…
    • Timestamp: 2026-05-04 01:22 GMT+8
    • Community: Community reaction (heuristic-fallback): The comment section is mostly positive. Top reactions focus on the token accounting thing with think blocks is real. ollama not separating reasoning_tokens properly messed up my budgets too | Yeah, openrouter actually breaks reasoning_tokens out in completion_tokens_details
    • Author: /u/AronSan
  2. How to optimise my OpenAI API response time? (gpt-4o-mini)

    • I’m currently using gpt-4o-mini as the model for my openai api in my project. Even getting a response from a short prompt such as “What is your name?” takes 5-10 seconds.
    • Timestamp: 2026-05-04 19:08 GMT+8
    • Community: Community reaction (heuristic-fallback): Top reactions focus on I see. What do you think could be the issue? My connection couldn’t be the issue either; I constantly get around 300 MBPS. I’d really… | To see if it’s your network you could do a ping against OpenAI’s API server. To see if it’s your app framework write…
    • Author: /u/FindingOk1094
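
For the gpt-4o-mini latency thread above, the commenters suggest isolating which layer is slow (network versus app framework). A minimal sketch of that idea, assuming you can wrap each layer in a zero-argument function: time every layer on its own and compare the numbers. The names `time_call` and `fake_model_call` are hypothetical helpers invented for illustration, and the stand-in function only sleeps; swap in the real request to measure anything meaningful.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_call(fn: Callable[[], object], repeats: int = 5) -> Dict[str, float]:
    """Run fn several times and report wall-clock latency in seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def fake_model_call() -> str:
    # Hypothetical stand-in for the real API request (assumption, not from the post).
    time.sleep(0.01)
    return "ok"


if __name__ == "__main__":
    # Time the bare call first, then the same call routed through your app
    # framework; a large gap between the two points at the framework layer.
    print(time_call(fake_model_call, repeats=3))
```

If the bare call is already slow, the network or the provider is the more likely culprit; if only the framework path is slow, look at the app code around the request.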

r/OpenWebUI

  1. Made a Skill for creating Open WebUI Tools, try it with Qwen3.6

  2. Back to Basics: making the scroll bar visible

    • One of the most annoying things about Open WebUI is the scroll bar - or the lack of it. As the page gets longer with a response, the scroll bar doesn’t appear until you start randomly clicking on the right edge of the page, and you might…
    • Timestamp: 2026-05-04 18:03 GMT+8
    • Community: Community reaction (heuristic-fallback): Top reactions focus on custom css can be put in static folder, but i agree its borderline invisible, i will open an issue to track this. Overall sentiment - post: mixed; author: mixed. Reply threads: 2026-05-04 18:20 GMT+8: post=mixed, author=mixed - custom css can be put in…
    • Author: /u/BringOutYaThrowaway
  3. Knowledge base vs LLM WIKI? How best to implement “context caching”?

    • Has anyone implemented Karpathy LLM wiki idea in Open Web UI with knowledge base? I used open terminal and qwen3-coder-next to implement a folder structure there.
    • Timestamp: 2026-05-04 12:27 GMT+8
    • Community: Community reaction (heuristic-fallback): Top reactions focus on You’re talking about the Open Terminal integration (github.com/open-webui/open-terminal (https://github.com/open-webui/open-terminal)) with… | Knowledge base is RAG basically. People are interested in LLM Wiki because they want something different. You…
    • Author: /u/Last_Bad_2687

r/selfhosted

  1. Self-hosted document & email search: Need a lightweight RAG indexer with hybrid search

    • Hi everyone, I am looking for a locally hostable application to get a comprehensive search across all my documents, emails, and files. I am currently using Paperless, a self-hosted mail server that fetches and stores all my emails via…
    • Timestamp: 2026-05-04 14:00 GMT+8
    • Community: Community reaction (heuristic-fallback): The comment section is mostly positive. Top reactions focus on Expand the replies to this comment to learn how AI was used in this post/project. | no one really seems to have done this yet. from what I understand you would have to give Paperless a pretty significant refactoring…
    • Author: /u/Alarmed_Bug3762
  2. Arr Stack Question?

    • Need some best practice advice on building out a media server. I already have qBittorrent and Jellyfin installed in separate Proxmox LXCs.
    • Timestamp: 2026-05-04 15:00 GMT+8
    • Community: Community reaction (heuristic-fallback): The comment section is mostly positive. Top reactions focus on Expand the replies to this comment to learn how AI was used in this post/project. | Separation is good. I like having my containers in various VMs. No LXC at all. Using Ansible/Terraform/Docker Swarm to automate and…
    • Author: /u/pagem
  3. n8n + Paperless-ngx + Paperless-GPT for adding RAG to your documents!

    • [Image: n8n + Paperless-ngx + Paperless-GPT for adding RAG to your documents!] Paperless-ngx is undoubtedly one of the most important and useful containers in my self-hosted stack. I have a modest collection of documents, ranging from…
    • Timestamp: 2026-05-04 16:06 GMT+8
    • Community: Community reaction (heuristic-fallback): The comment section is mostly positive. Top reactions focus on Expand the replies to this comment to learn how AI was used in this post/project. | That’s what I’m implementing actually. If I only would have enough time.. Overall sentiment β€” post: positive; author: mixed. Reply…
    • Author: /u/hackslashX
  4. Postiz Self-Hosted - All working, but API access does not

    • I have self-hosted Postiz on my server at home via Docker. I have it published to the Internet via NGINX Proxy Manager.
    • Timestamp: 2026-05-04 16:58 GMT+8
    • Community: Community reaction (heuristic-fallback): The comment section is mostly positive. Top reactions focus on Expand the replies to this comment to learn how AI was used in this post/project. | Sounds like a proxy or config issue rather than Postiz itself. Make sure NGINX Proxy Manager is forwarding the correct path to the…
    • Author: /u/Patient_Scale9438
  5. Speakr v0.8.19 - Local audio/video transcription app update

    • [Image: Speakr v0.8.19 - Local audio/video transcription app update] Hey r/selfhosted (/r/selfhosted), another Speakr update. If you haven’t seen this before, Speakr is a self-hosted audio transcription app: record or upload audio/video,…
    • Timestamp: 2026-05-04 14:49 GMT+8
    • Community: Community reaction (heuristic-fallback): The comment section is split between positive and skeptical. Top reactions focus on Expand the replies to this comment to learn how AI was used in this post/project. | ok dumb question. if i run this as a docker on my server, how do i transcribe, f.e. a meeting in ms teams on…
    • Author: /u/hedonihilistic

r/ClaudeAI

  1. Reminder: Have you checked your context lately?
    • [Image: Reminder: Have you checked your context lately?] Just a reminder to run /context. I like to think I was on top of this!
    • Timestamp: 2026-05-04 03:19 GMT+8
    • Community: Community reaction (heuristic-fallback): The comment section is split between concerned and joking. Top reactions focus on When you start up a conversation, Claude code will pull in your plugins, mcp, extra data for you prompt so it knows to use them. These will… | This is the response I immediately expected coming…
    • Author: /u/Arona_Daal

r/ClaudeCode

  1. Building an Auto-Restart Mechanism for Claude Code
    • [Image: Building an Auto-Restart Mechanism for Claude Code] Claude Code requires a manual session restart every time you install an MCP server or change a config, which breaks your momentum. I built claude-resurrect to fix this.
    • Timestamp: 2026-05-04 14:50 GMT+8
    • Author: /u/emnoleg

r/Codex

  • No non-pinned/newsworthy posts fetched after filtering.

Generated 2026-05-04 22:11 GMT+8 | Next update in 2 hours