🤖 AI News Summary - 2026-05-10 13:50 GMT+8
Focused AI/dev subreddit roundup.
Full site: https://kkklobsterfarming.github.io/ai-news-summary-site/
What changed since last run
- Markdown browser for LLMs with MCP — r/llmdevs
- What are you using for STT and TTS (voice chats)? — r/OpenWebUI
- model: add sarvam_moe architecture support by sumitchatterjee13 · Pull Request #20275 · ggml-org/llama.cpp — r/LocalLLaMA
r/openai
- AI is coming for our jobs and also it can’t pay its own electricity bills
- For two years we’ve been told AI is coming for our jobs. Lawyers, coders, writers, designers, everyone’s apparently on borrowed time.
- Timestamp: 2026-05-10 03:15 GMT+8
- Community: Community reaction (heuristic-fallback): The comment section is split between joking and positive. Top reactions focus on Plenty of world-changing tech burned cash for years while the economics caught up. They’re building massive infrastructure for demand… | In a long term that’s true, it’s inevitable. I’m just…
- Author: /u/Ashiq_Luxline
r/LocalLLaMA
Running Minimax 2.7 at 100k context on strix halo
- [Image: Running Minimax 2.7 at 100k context on strix halo] Just wanted to share because it took me a lot of tweaking to get here: `llama-server -hf unsloth/MiniMax-M2.7-GGUF:UD-IQ3_XXS --temp 1.0 --top-k 40 --top-p 0.95 --host 0.0.0.0…
- Timestamp: 2026-05-10 04:21 GMT+8
- Community: Community reaction (heuristic-fallback-http-403): The comment section is split between positive and skeptical. Top reactions focus on Are you sure it’s a good idea to specify --cache-ram 0? I believe it makes agentic workflows very slow. 2048 would be enough. Also you can… | Why do you think --cache-ram 0 will…
- Author: /u/Zc5Gwu
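The post's full command is truncated above, so as a sketch only: the flags below are real llama-server options, but the port, context size, and cache value are assumptions, with --cache-ram 2048 following the commenter's suggestion rather than the OP's --cache-ram 0.

```shell
# Illustrative llama-server launch for a ~100k-context local model.
# Values here are assumptions, not the OP's exact (truncated) command;
# --cache-ram 2048 reflects the commenter's advice that 0 slows agentic use.
llama-server -hf unsloth/MiniMax-M2.7-GGUF:UD-IQ3_XXS \
  --ctx-size 100000 --temp 1.0 --top-k 40 --top-p 0.95 \
  --host 0.0.0.0 --port 8080 --cache-ram 2048
```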
- [Image: Exactly a year ago, I started working on an MCP server I launched on reddit that became by far my most active open source project!] This isn’t an advertisement, and it’s very much local and open - I already don’t have enough time…
- Timestamp: 2026-05-10 06:08 GMT+8
- Community: Community reaction (heuristic-fallback): The comment section is split between joking and positive. Top reactions focus on Do share! I find at this point the main thing I need in my chat ui is email, calendar and todo list - I’ve honestly stopped using context7… | Not much to share - I made a little MCP server that…
- Author: /u/taylorwilsdon
After you’ve set up local models, where can you find interesting apps that can use them?
- I have Qwen3.6-27B as my main model; I use it for coding with opencode and chatting with open-webui, and have yet to try out hermes or openclaw. I found out about their existence basically by searching or through reddit - but maybe there’s more…
- Timestamp: 2026-05-10 01:25 GMT+8
- Community: Community reaction (heuristic-fallback): The comment section is mostly positive. Top reactions focus on Here are some self-hosted apps I get llama.cpp usage out of with local models. Home Assistant for smart home stuff, Paperless-ngx for… | Thanks for sharing the list you use, paperless-ngx and n8n look interesting,…
- Author: /u/ReferenceOwn287
- [Image: BeeLlama.cpp: advanced DFlash & TurboQuant with support of reasoning and vision. Qwen 3.6 27B Q5 with 200k context on 3090, 2-3x faster than baseline (peak 135 tps!)] TL;DR New llama.cpp fork!
- Timestamp: 2026-05-10 00:05 GMT+8
- Community: Community reaction (heuristic-fallback): The comment section is mostly positive. Top reactions focus on Did the MRs for this get rejected on the original llama.cpp, or is the MR flow just so slow (read: “takes a week”) that it made more… | Thanks for making it happen still. Yes, the AI policy is a rather…
- Author: /u/Anbeeld
- [Image: model: add sarvam_moe architecture support by sumitchatterjee13 · Pull Request #20275 · ggml-org/llama.cpp] Sarvam-30B is an advanced Mixture-of-Experts (MoE) model with 2.4B non-embedding active parameters, designed primarily for…
- Timestamp: 2026-05-10 02:46 GMT+8
- Community: Community reaction (heuristic-fallback): The comment section is mostly positive. Top reactions focus on Omg! This took a really long time. Finally it sees the light of day! I guess too little, too late. | Looks like the hype is gone, I was wondering, does anyone still remember this model 😉. Overall sentiment — post:…
- Author: /u/jacek2023
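As background for why a 30B MoE model can have only ~2.4B active parameters: only the top-k gated experts run per token. The plain-Python sketch below illustrates that routing idea; the function names and gate scores are made up and this is not the sarvam_moe or llama.cpp implementation.

```python
# Toy top-k mixture-of-experts gating: only k experts run per token, which is
# how a large MoE keeps its *active* parameter count small per forward pass.
# Illustrative sketch only; not the sarvam_moe / llama.cpp code.
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def moe_forward(token, experts, gate_scores, k=2):
    """Route `token` to the top-k experts by gate score, mix their outputs."""
    topk = sorted(range(len(experts)), key=lambda i: gate_scores[i], reverse=True)[:k]
    weights = softmax([gate_scores[i] for i in topk])  # renormalise over top-k
    return sum(w * experts[i](token) for w, i in zip(weights, topk))

# Example: 4 "experts" that just scale the input; the gate prefers experts 2 and 0.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
out = moe_forward(10.0, experts, gate_scores=[0.9, 0.1, 1.2, 0.2], k=2)
```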
r/llmdevs
- Markdown browser for LLMs with MCP
- I modified the textweb renderer built by [u/cdr420](https://www.reddit.com/user/cdr420/) (https://www.reddit.com/r/LocalLLaMA/comments/1r90b3a/textweb_render_web_pages_as_25kb_text_grids/…
- Timestamp: 2026-05-10 13:27 GMT+8
- Author: /u/DocWolle
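The core idea behind tools like this is compressing a web page into a small amount of text that fits cheaply in an LLM's context. A minimal stdlib-only sketch of that idea, not the linked project's actual code:

```python
# Minimal sketch of a "text browser for LLMs": flatten an HTML page to compact
# plain text under a character budget. The linked textweb/MCP project works
# differently; this only illustrates the concept.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    SKIP = {"script", "style"}          # tags whose text content we discard

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def html_to_text(html: str, max_chars: int = 25_000) -> str:
    """Flatten HTML to whitespace-joined text, truncated to a context budget."""
    p = TextExtractor()
    p.feed(html)
    return " ".join(p.parts)[:max_chars]

text = html_to_text("<html><style>p{}</style><body><p>Hello</p> <p>LLM</p></body></html>")
```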
r/OpenWebUI
- What are you using for STT and TTS (voice chats)?
- Hi all, my STT & TTS setup in Open WebUI is poor. What I really want is chatgpt.com levels of voice support (STT and TTS).
- Timestamp: 2026-05-10 13:38 GMT+8
- Author: /u/PersianMG
r/selfhosted
- No non-pinned/newsworthy posts fetched after filtering.
r/ClaudeAI
- No non-pinned/newsworthy posts fetched after filtering.
r/ClaudeCode
- How are you all mitigating the WEEKLY usage tight quota issue? Here’s my latest idea…
- Here’s something I’m trying for my overnight run for my latest work, using GSD v1 in the Claude Code CLI harness: (I’m at 30% usage left for the week, and 2.5 more days before this overnight run… I’m going to run out on a Max20 plan, as…
- Timestamp: 2026-05-10 12:16 GMT+8
- Community: Community reaction (heuristic-fallback): The comment section is split between joking and positive. Top reactions focus on PS: early outcomes for phase one are already in… this looks promising! I do hope it pans out. Pray for me lol. — Both unsolicited…. Overall sentiment — post: mixed; author: mixed. Reply…
- Author: /u/N3TCHICK
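The arithmetic behind the OP's worry can be made explicit: 30% of the weekly quota left with 2.5 days to reset caps the safe burn rate. The helper below is hypothetical; only the 30% / 2.5-day figures come from the post.

```python
# Back-of-envelope quota budgeting for a weekly-capped plan: given how much
# allowance remains and how long until reset, what hourly burn rate lasts?
# Hypothetical helper; the 30% remaining / 2.5 days figures are from the post.
def hourly_budget(percent_left: float, days_until_reset: float) -> float:
    """Max % of the weekly quota spendable per hour while still lasting to reset."""
    hours = days_until_reset * 24
    return percent_left / hours

rate = hourly_budget(30.0, 2.5)   # % of weekly quota per hour
```

At that rate, a single heavy overnight run can easily exceed the remaining budget, which is presumably why the OP expects to run out even on a Max20 plan.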
r/Codex
Thinking of switching from Claude Pro to ChatGPT Plus for coding. Thoughts?
- 👋 I’ve been using Claude Pro for a while now, but I’m considering making the jump over to a ChatGPT Plus subscription. I mainly use AI for vibe coding, and I’d love to hear from anyone who has recently used both.
- Timestamp: 2026-05-10 13:00 GMT+8
- Community: Community reaction (heuristic-fallback): The comment section is mostly positive. Top reactions focus on Give it a shot for a month - I think you’ll be pleasantly surprised. Made the move myself for the $100 plans and I’ve enjoyed Codex as an… | I’ve used both extensively for scientific programming. Codex’s rate…
- Author: /u/duckhtn89
Token costs pushed me toward multi-agent orchestration. Is that happening to anyone else?
- I wrote up the reason I moved toward multi-agent orchestration: token economics. The core issue is that every serious agentic workflow can become expensive when one strong model carries every step: context loading, source mapping,…
- Timestamp: 2026-05-10 02:01 GMT+8
- Community: Community reaction (heuristic-fallback): The comment section is split between concerned and positive. Top reactions focus on I’ve been building my own claw/hermes agent (before I ever knew what those were) and hooked it up to fall back to free llms with a little… | The $100 2x deal rn is solid af. I’ll be switching…
- Author: /u/kalensr
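The token-economics argument in the post can be sketched concretely: route each workflow step to the cheapest model capable of it, instead of paying frontier-model rates for context loading and source mapping. All model names, prices, and difficulty tiers below are made up for illustration.

```python
# Sketch of cost-aware multi-agent routing: cheap models take the easy steps,
# the expensive model only takes the hard ones. Names and prices are invented.
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float   # illustrative USD prices, not real pricing
    max_difficulty: int         # highest task difficulty this model can handle

MODELS = [
    Model("small-local", 0.0, 1),
    Model("mid-tier", 0.5, 2),
    Model("frontier", 5.0, 3),
]

def route(difficulty: int) -> Model:
    """Pick the cheapest model whose capability covers the task difficulty."""
    capable = [m for m in MODELS if m.max_difficulty >= difficulty]
    return min(capable, key=lambda m: m.cost_per_1k_tokens)

def workflow_cost(steps):
    """steps: list of (difficulty, tokens_in_thousands) pairs."""
    return sum(route(d).cost_per_1k_tokens * kt for d, kt in steps)

# Context loading and source mapping are easy (difficulty 1); only the final
# synthesis step is hard (difficulty 3).
steps = [(1, 40), (1, 20), (3, 5)]
routed_cost = workflow_cost(steps)                      # only the hard step pays
flat_cost = sum(5.0 * kt for _, kt in steps)            # frontier model for all
```

Under these invented numbers the routed workflow costs a fraction of running one strong model end to end, which is the core of the OP's point.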
Generated 2026-05-10 13:50 GMT+8 | Next update in 2 hours