2026-05-01 22:10 GMT+8 · summary_2026-05-01_22-10.md

🤖 AI News Summary - 2026-05-01 22:10 GMT+8

Focused AI/dev subreddit roundup.

Full site: https://kkklobsterfarming.github.io/ai-news-summary-site/

What changed since last run

Websearch API / omlx - owu - home assistant — r/OpenWebUI
Open Relay v3.4 — Native Live Inline Visualizations are here! 🎉 — r/OpenWebUI
Got DFlash speculative decoding working on Qwen3.5-35B-A3B with an RTX 2080 SUPER 8GB — r/LocalLLaMA
Whats the latest status on 7900xtx multi-GPU setups? — r/LocalLLaMA
I’ve spent the last few months building an open specification for compiled, queryable team knowledge that any AI agent can read from. v0.1.0 is live, looking for feedback and testing! — r/llmdevs
Self-hosted LLM on GCP (1×H100 + 1×L4) for legal RAG in European languages — looking for advice — r/llmdevs
Does Authentik support only one webfinger-discoverable OIDC issuer href for multiple applications per hostname? — r/selfhosted
Example of using the GodotIQ MCP with Coding Agent to create video games — r/ClaudeAI
Running llama.cpp on Snapdragon Hexagon NPU seems promising — r/LocalLLaMA
Found Zero day Claude Desktop + Chromium bug need to know where to submit report. — r/ClaudeAI
Prompt insertion of Excel/CSV files — r/OpenWebUI
Sharing expierence and advice request — r/OpenWebUI

r/openai

No non-pinned/newsworthy posts fetched after filtering.

r/LocalLLaMA

Got DFlash speculative decoding working on Qwen3.5-35B-A3B with an RTX 2080 SUPER 8GB
- I managed to get DFlash speculative decoding working in llama.cpp on a pretty VRAM-limited setup. This was tested with the DFlash PR:…
- Timestamp: 2026-05-01 19:49 GMT+8
- Community: The comment section is mostly positive. Top reactions: “Can you show me the link to the draft model Qwen3.5-35B-A3B-DFlash-Q4_K_M.gguf? Thanks” | “https://github.com/ggml-org/llama.cpp/pull/22105 I mainly use this PR and its instructions:” Overall sentiment…
- Author: /u/jwestra
Whats the latest status on 7900xtx multi-GPU setups?
- I am currently running dual RTX 5060 Ti 16GB cards (both of which are easy to sell or re-use in other PCs at home) and monitoring the used market for more of the same or, alternatively, an RTX 3090. I couldn’t help but notice that sometimes some…
- Timestamp: 2026-05-01 19:54 GMT+8
- Community: The comment section is split between critical and joking. Top reactions: “Tensor parallelism works in vLLM. Split mode tensor works in llama.cpp with ROCm but not Vulkan. It’s still occasionally flaky so I often…” | “Too bad the prices of 7900xtx caught on to this…”
- Author: /u/ziphnor
Running llama.cpp on Snapdragon Hexagon NPU seems promising
- https://github.com/ggml-org/llama.cpp/blob/master/docs/backend/snapdragon/README.md I have a OnePlus 12 with Snapdragon 8 Gen 3. I followed the above…
- Timestamp: 2026-05-01 13:26 GMT+8
- Community: The comment section is split between critical and positive. Top reactions: “I have an SD Elite (I guess it’s Gen 4) OnePlus 13 and my experience of running it was not good. I was mostly interested in the Qwen3.5 9B Q4 model…” | “Thanks for your input. What kind of number do you…”
- Author: /u/Ok_Warning2146
Open Models - April 2026 - One of the best months of all time for Local LLMs?
- [Image] Any underrated or overlooked models? FYI: MiniMax-M2.7 switched its license (from MIT to Non-Commercial), so it’s not in the graph.
- Timestamp: 2026-05-01 03:50 GMT+8
- Community: The comment section is mostly positive. Top reactions: “I found qwen3.5 122b q5 to be much worse than qwen3.6 17b q5 and even qwen3.6 35b q5. However, I am extremely excited to try out qwen3.6…” | “What hardware are you using to run these massive models at home?” Overall…
- Author: /u/pmttyji
Qwen 3.6 27B vs Gemma 4 31B - making Packman game!
- [Image] Gemma just crushed Qwen in a local LLM gamedev contest! Device: MacBook Pro M5 Max, 64GB RAM. Qwen 3.6 27B: 32 tokens/sec · 18m 04s · 33,946 tokens.
- Timestamp: 2026-05-01 09:03 GMT+8
- Community: The comment section is mostly critical. Top reactions: “‘Keep performance stable’ and ‘no bugs’ are pretty hilarious additions to the prompt.” | “I’ve shared it before but my secret source when prompting is: IMPORTANT: You are an expert coder turned professor, however you have…”
- Author: /u/gladkos
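The Qwen figures quoted above can be cross-checked with simple arithmetic. This is a minimal sketch that assumes the 18m 04s is the whole run and the 33,946 tokens were all produced within it; the post excerpt does not say how its 32 tokens/sec figure was measured, and the function name is hypothetical.

```python
def mean_tokens_per_second(total_tokens: int, minutes: int, seconds: int) -> float:
    """Average throughput over a whole run (hypothetical helper for this digest)."""
    elapsed_seconds = minutes * 60 + seconds
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return total_tokens / elapsed_seconds


# Figures quoted in the post excerpt: 33,946 tokens in 18m 04s.
rate = mean_tokens_per_second(33_946, 18, 4)
print(f"{rate:.1f} tokens/sec")  # prints "31.3 tokens/sec"
```

Under that assumption the whole-run average lands slightly below the quoted 32 tokens/sec, which would be consistent with the quoted number being a generation-only rate.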

r/llmdevs

I’ve spent the last few months building an open specification for compiled, queryable team knowledge that any AI agent can read from. v0.1.0 is live, looking for feedback and testing!
- The problem is something I’ve watched people at work and in the community try to solve over and over in different ways: Team Knowledge Hubs, Local RAG for development environments, one-off retrieval pipelines bolted onto Confluence…
- Timestamp: 2026-05-01 00:35 GMT+8
- Community: The comment section is mostly positive. Top reactions: “[…] ‘Knowledge Hub’, local RAG, or one‑off Confluence index keeps tripping over: each tool tries to maintain its own worldview. Different teams,…” | “Hi JDubbs, I like your standard of practice/schema approach. I’m…”
- Author: /u/JDubbsTheDev
Self-hosted LLM on GCP (1×H100 + 1×L4) for legal RAG in European languages — looking for advice
- Hey, I’m planning to migrate a production RAG system from Azure OpenAI (currently using 4o + 4.1 for different agents) to a self-hosted setup…
- Timestamp: 2026-05-01 20:08 GMT+8
- Community: The comment section is split between concerned and skeptical. Top reactions: “What drives the cost? What does the processing mean? If it is mostly chunking and embedding then it is unlikely to be the cost driver.” | “Fair question. Breakdown is roughly 30% OCR (external…”
- Author: /u/Candy_Lucy

r/OpenWebUI

Websearch API / omlx - owu - home assistant
- My goal is to use a local LLM with Home Assistant. oMLX provides the backend, and Open WebUI acts as the API proxy to Home Assistant.
- Timestamp: 2026-05-01 15:43 GMT+8
- Community: The comment section is mostly positive. Top reactions: “I think only RAG works from OWUI when you proxy through, but not tool calls etc. What you could do would be to use FastAPI and LangChain…” | “Home Assistant has MCP capabilities for models. If you had an MCP for…”
- Author: /u/pr3ddi
Open Relay v3.4 — Native Live Inline Visualizations are here! 🎉
- [Image] Version 3.4 of Open Relay is live, bringing the latest live visualizations to the app natively!
- Timestamp: 2026-05-01 05:04 GMT+8
- Community: The comment section is mostly positive. Top reactions: “💪💪💪💪💪💪💪💪💪💪💪💪💪 That’s how Open WebUI grows. True builders.” | “100%! Never thought of doing it even though it feels so obvious haha. Literally made the entire render pipeline live instead of code ->…” Overall…
- Author: /u/Zealousideal_Fox6426
Prompt insertion of Excel/CSV files
- Hello, I am currently using OpenWebUI at work and I would like to better understand how Excel and CSV files are inserted into my users’ prompts. So far, I have the impression that when someone uploads a file of this type, it gets converted…
- Timestamp: 2026-05-01 00:30 GMT+8
- Author: /u/Alternative-Run6265
Sharing expierence and advice request
- I have been using OpenWebUI for a while with local models, but it seems like I am missing something. There are so many features, but none of them feel native to me.
- Timestamp: 2026-05-01 03:36 GMT+8
- Community: The comment section is split between critical and positive. Top reactions: “Obligatory ‘have you turned on Native function tool calling?’” | “Yes, but this is strange. I didn’t find it as a default. You need to go to a specific model and turn it on. But when you open chat there is…”
- Author: /u/Right-Ice-6850

r/selfhosted

Does Authentik support only one webfinger-discoverable OIDC issuer href for multiple applications per hostname?
- I like Authentik and it’s what I use, but it seems like it doesn’t support per-application OIDC via a global, application-agnostic issuer href using WebFinger, which seems to basically mean you can only have one OIDC application per hostname…
- Timestamp: 2026-05-01 10:08 GMT+8
- Community: Top reactions: “Pretty much either use separate hostnames per OIDC discovered app or serve a custom WebFinger JSON behind your reverse proxy. I like…” Overall sentiment — post: mixed;…
- Author: /u/-jsteinke
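One reply in this thread suggests serving a custom WebFinger JSON document behind a reverse proxy. As a rough illustration of what such a document could contain, the sketch below builds a JSON Resource Descriptor (RFC 7033) carrying the issuer link relation defined by OpenID Connect Discovery 1.0. The resource and issuer values are placeholders, the helper name is hypothetical, and nothing here reflects Authentik's actual configuration or the poster's setup.

```python
import json

# Link relation that OpenID Connect Discovery 1.0 uses for issuer lookup.
OIDC_ISSUER_REL = "http://openid.net/specs/connect/1.0/issuer"


def webfinger_response(resource: str, issuer: str) -> str:
    """Build a JRD body for GET /.well-known/webfinger?resource=...

    Hypothetical helper: a reverse proxy or small static handler would
    return this with Content-Type: application/jrd+json.
    """
    jrd = {
        "subject": resource,
        "links": [{"rel": OIDC_ISSUER_REL, "href": issuer}],
    }
    return json.dumps(jrd, indent=2)


# Placeholder values, not taken from the thread.
body = webfinger_response(
    "acct:alice@example.com",
    "https://auth.example.com/issuer-for-one-app/",
)
print(body)
```

Because the response names a single issuer per queried resource, a handler like this would have to decide which application's issuer to return, which is the limitation the post describes.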
Guidance with tailscale + coolify + cloudflare
- I was managing a single Raspberry Pi with Coolify and cloudflared configured. I had a couple of services running (some public, some private in my home network).
- Timestamp: 2026-05-01 14:52 GMT+8
- Community: The comment section is mostly positive. Top reactions: “Cloudflare Tunnel fixes the 443 conflict; route Headscale and Coolify to different local ports via subdomains. Keep all DBs/state…”
- Author: /u/Blue_Dude3

r/ClaudeAI

Example of using the GodotIQ MCP with Coding Agent to create video games
- [Image] I’ve developed an MCP server for Godot. Most coding agents in Godot today work blindly: they read files but don’t know where the nodes are in space, don’t…
- Timestamp: 2026-05-01 19:04 GMT+8
- Community: Top reactions: “Just FYI Godot is strongly anti AI and will attack this.” Overall sentiment — post: mixed; author: mixed. Reply threads: 2026-05-01 21:15 GMT+8: post=mixed, author=mixed — “Just FYI Godot is strongly anti AI and will attack this.” | 2026-05-01 19:19 GMT+8:…
- Author: /u/jf_nash
Found Zero day Claude Desktop + Chromium bug need to know where to submit report.
- Looking for official link / process to submit a vulnerability report for a high-risk official Claude Desktop + Chrome extension + native host + Cowork/MCP configuration that can become RAT-equivalent if a session, prompt chain, same-user…
- Timestamp: 2026-05-01 21:52 GMT+8
- Community: The thread has active discussion, but no clear consensus stood out. Overall sentiment — post: mixed; author: mixed. Reply threads: 2026-05-01 22:08 GMT+8: post=mixed, author=mixed — “Ask Claude instead of looking for attention here”
- Author: /u/ChangeGlittering1800
Spent an evening making a launch video with Claude + Blender MCP
- [Image] Solo dev working on a habit tracker app (Spira — habits become flowers that bloom over time). Needed a 10s vertical video for App Store / TikTok and didn’t have a…
- Timestamp: 2026-05-01 05:34 GMT+8
- Community: Top reactions: “Can you export the conversation and share it please?” Overall sentiment — post: mixed; author: mixed. Reply threads: 2026-05-01 20:25 GMT+8: post=mixed, author=mixed — “Can you export the conversation and share it please?”
- Author: /u/Positive_Camel2086

r/ClaudeCode

No non-pinned/newsworthy posts fetched after filtering.

r/Codex

No non-pinned/newsworthy posts fetched after filtering.

Generated 2026-05-01 22:10 GMT+8 | Next update in 2 hours