2026-06-27 13:20 GMT+8 · summary_2026-06-27_13-20.md

🤖 AI News Summary - 2026-06-27 13:20 GMT+8

Focused AI/dev subreddit roundup.

Full site: https://ai-news-summary.pages.dev/

What changed since last run

Disabling automatic transcription of uploaded audio files — r/OpenWebUI
Qwen3:14b vs Gemma4:26b for tool calling — what’s your experience? — r/OpenWebUI
Ornith-1.0-35B Q3_K_M: ~17 GB VRAM, KLD-checked against BF16 — r/LocalLLaMA
Gemma 4 12B refuses to work since turning function calling on! — r/OpenWebUI
vulkan: make TP viable by pwilkin · Pull Request #25051 · ggml-org/llama.cpp — r/LocalLLaMA
Context Recycling for Long-Horizon LLM Inference (ContextForge) + LLM Wiki + long-horizon benchmark results — r/llmdevs
Jcorp Nomad: A Self Hosted media server that fits in your pocket! — r/selfhosted
What self-hosted apps do you actually use every day? — r/selfhosted
750 tps on GPT 5.6 Sol, INSANE — r/openai
GPT 5.6 “sol” announced — r/Codex
GPT-5.6 Officially Previewed: Beats Mythos 5. — r/Codex
I used Claude to fix my biggest frustration with PDFs — r/ClaudeAI

r/openai

#	Post	Summary	Time	Score	Author	Community reaction
1	750 tps on GPT 5.6 Sol, INSANE	[Image: 750 tps on GPT 5.6 Sol, INSANE] https://preview.redd.it/m1q4synsqo9h1.png?width=932&format=png&auto=webp&s=fab697901b42720ab4b42d00d10bc456cacb30f4 This is too crazy right? Such high TPS on a…	2026-06-27 04:19 GMT+8		/u/VivaLaRay1	Community reaction (frontier/gpt-5.4-mini): Commenters largely treated 750 TPS on GPT-5.6 Sol as a massive jump over the cited 5.5 “extra high” figure of 68 tokens/sec, or about 102 tokens/sec in Codex fast mode with a 1.5x speedup, so the raw throughput claim landed as genuinely impressive. The main caveat was that agent wall-clock time is often dominated by tool/command execution like compilation, so faster token generation does not automatically make end-to-end workflows 7x faster; on the upside, people thought this kind of speed could finally make browser/computer use more practical, though one user noted Playwright CLI automation is already pretty good and just very token-hungry. Overall sentiment — post: positive; author: neutral. Reply threads: 2026-06-27 07:14 GMT+8: post=positive, author=neutral — They estimated GPT-5.5 extra-high at 68 tokens per second and Codex fast mode at about 102 tokens per second,… \| 2026-06-27 10:49 GMT+8: post=concerned, author=neutral — They argued that agent latency is often dominated by tool and command execution such as compilation, so… \| 2026-06-27 04:58 GMT+8: post=positive, author=neutral — They said browser or computer use will finally be usable with this level of speed.
2	This is how you do business, not fear mongering. We will be rocking this model by next week	[Image: This is how you do business, not fear mongering.	2026-06-27 08:40 GMT+8		/u/py-net	Community reaction (frontier/gpt-5.4-mini): Commenters focus less on the model itself than on trust and timing: one correction says the blog only promised release in “coming weeks,” not next week, while several others argue OpenAI’s history makes its claims hard to trust and that lying would cost credibility. A smaller countercurrent defends the company as consistently messaging this way and says its security claims have not yet been disproven, but even that support is qualified by concern about making big security claims on a shaky trust baseline; the practical takeaway is to treat launch dates and safety claims as uncommitted until the official wording is explicit or independently verified. Overall sentiment — post: skeptical; author: neutral. Reply threads: 2026-06-27 08:50 GMT+8: post=critical, author=neutral — Corrects the timeline, saying the blog only promises the model is coming in “coming weeks” rather than next… \| 2026-06-27 09:04 GMT+8: post=critical, author=neutral — Says lying may help in the short term, but it costs trust. \| 2026-06-27 09:48 GMT+8: post=critical, author=neutral — Says it is baffling that anyone believes OpenAI anymore and argues that a pattern of dishonesty makes big…
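
The caveat in the first thread (tool execution dominating agent wall-clock time) can be made concrete with an Amdahl-style estimate. This is only a sketch: the 102 and 750 tokens/sec figures come from the thread, while the 40% generation share is a hypothetical value chosen for illustration, not something the commenters reported.

```python
def end_to_end_speedup(generation_fraction: float, generation_speedup: float) -> float:
    """Amdahl-style estimate: only the token-generation share of wall-clock time gets faster."""
    if not 0.0 <= generation_fraction <= 1.0:
        raise ValueError("generation_fraction must be between 0 and 1")
    if generation_speedup <= 0:
        raise ValueError("generation_speedup must be positive")
    tool_fraction = 1.0 - generation_fraction  # compilation, commands, other tool calls
    return 1.0 / (tool_fraction + generation_fraction / generation_speedup)


# Figures quoted in the thread (tokens per second).
baseline_tps = 102.0  # Codex fast mode, as estimated by a commenter
claimed_tps = 750.0   # claimed GPT-5.6 Sol throughput
raw_speedup = claimed_tps / baseline_tps

# Hypothetical split (assumption): 40% of agent time is generation, 60% is tools.
overall = end_to_end_speedup(0.4, raw_speedup)

print(f"raw generation speedup: {raw_speedup:.2f}x")
print(f"end-to-end speedup:     {overall:.2f}x")
```

Under that assumed split, a roughly 7x jump in token throughput yields only about a 1.5x end-to-end gain, which is the commenter's point in numeric form.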

r/LocalLLaMA

#	Post	Summary	Time	Score	Author	Community reaction
1	Ornith-1.0-35B Q3_K_M: ~17 GB VRAM, KLD-checked against BF16	I quantized deepreinforce-ai/Ornith-1.0-35B down to Q3_K_M so it fits comfortably on a single GPU. Produced locally with llama-quantize from the upstream BF16 GGUF — the quantizer took it from 16.01 BPW down to 3.87 BPW, landing at 16.8 GB on disk / ~17 GiB loaded VRAM, about 21% smaller than Q4_K_M.	2026-06-27 10:30 GMT+8	16	/u/Blahblahblakha	Community reaction (frontier/gpt-5.4-mini): Commenters mostly treated the Q3_K_M release as a practical win: the KL/top-1 validation and ~17 GiB VRAM footprint were cited as making a 35B model usable on consumer cards, and one user reported it working well with an Opus planner plus Q3 executor setup. The main caveats were skepticism about whether the model is actually good versus just another hyped fine-tune, confusion over provenance (one commenter called it a quantized original 35B while another said the underlying model is a Qwen 35B finetune), and a hardware tradeoff explanation that 3-bit weights stay packed at 3.87 bits/weight, unpack to fp16 for matmul, and only help when memory bandwidth is the bottleneck, with Q4 sometimes still winning under load. Overall sentiment — post: positive; author: positive. Reply threads: 2026-06-27 10:44 GMT+8: post=positive, author=positive — They praised the KL and top-1 validation, said 17 GB Q3 with 80+ percent top-1 makes 35B usable on consumer… \| 2026-06-27 12:27 GMT+8: post=skeptical, author=neutral — They asked whether the model is actually good because they have downloaded too many hyped fine-tunes. \| 2026-06-27 12:34 GMT+8: post=neutral, author=neutral — They asked whether 3-bit quantization makes sense on current hardware or whether it effectively rounds to 4…
2	vulkan: make TP viable by pwilkin · Pull Request #25051 · ggml-org/llama.cpp	[Image: vulkan: make TP viable by pwilkin · Pull Request #25051 · ggml-org/llama.cpp] The legend Piotr has taken a pass at making Vulkan Tensor Parallel somewhat usable, really looking forward to seeing this evolve…	2026-06-27 04:57 GMT+8		/u/TKGaming_11	Community reaction (frontier/gpt-5.4-mini): Commenters are broadly interested in making Vulkan tensor parallelism usable and immediately ask for validation on non-NVIDIA hardware, with AMD users offering hardware and follow-up testing. The main practical disagreement is performance: one user posted RX 7900 XTX numbers for 2/4/8 GPUs showing Vulkan TP collapsing as GPU count rises while ROCm stays much stronger, and another notes Vulkan still copies through main memory unless device-group P2P is solved. The operator takeaway is that this looks promising as a path for multi-GPU support, but anyone deploying on AMD should benchmark scaling, P2P/memory-bandwidth behavior, and backend choice before assuming TP will improve throughput. Overall sentiment — post: mixed; author: positive. Reply threads: 2026-06-27 05:31 GMT+8: post=positive, author=neutral — They ask for test results from people on non-NVIDIA devices, signaling interest in the Vulkan TP work but… \| 2026-06-27 06:26 GMT+8: post=positive, author=positive — They offer AMD hardware to help support the work, which is a direct show of support for the patch and its… \| 2026-06-27 07:33 GMT+8: post=skeptical, author=neutral — They post benchmark tables for 2, 4, and 8 RX7900XTX cards showing Vulkan TP falling off sharply with more…
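
For readers unfamiliar with the "KLD-checked" and top-1 figures in the Ornith thread, the sketch below shows what those two metrics measure and how bits per weight relate to file size. It is not the poster's validation pipeline: the logits are made-up toy values, and the 35e9 parameter count is an assumption taken from the model name.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one token position."""
    m = max(logits)
    log_total = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_total for x in logits]


def kl_divergence(ref_logits, quant_logits):
    """KL(P_ref || P_quant) in nats for one token position."""
    log_p = log_softmax(ref_logits)
    log_q = log_softmax(quant_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def top1_agreement(ref_batch, quant_batch):
    """Fraction of positions where both models pick the same most likely token."""
    def argmax(row):
        return max(range(len(row)), key=row.__getitem__)
    same = sum(argmax(r) == argmax(q) for r, q in zip(ref_batch, quant_batch))
    return same / len(ref_batch)


def estimated_size_gb(n_params, bits_per_weight):
    """Rough weight size in decimal GB: parameters * bits per weight / 8."""
    return n_params * bits_per_weight / 8 / 1e9


# Toy logits (assumption): three positions, vocabulary of three tokens.
ref = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.0], [1.0, 1.1, 3.0]]
quant = [[1.9, 1.1, 0.2], [0.4, 2.4, 0.3], [1.2, 3.1, 3.0]]

mean_kld = sum(kl_divergence(r, q) for r, q in zip(ref, quant)) / len(ref)
print(f"mean KLD:        {mean_kld:.4f} nats")
print(f"top-1 agreement: {top1_agreement(ref, quant):.2%}")
print(f"size at 3.87 BPW with an assumed 35e9 params: {estimated_size_gb(35e9, 3.87):.2f} GB")
```

The size estimate lands near, but not exactly on, the 16.8 GB reported in the post, which is expected given that the true parameter count and file metadata are not known here.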

r/llmdevs

#	Post	Summary	Time	Score	Author	Community reaction
1	Context Recycling for Long-Horizon LLM Inference (ContextForge) + LLM Wiki + long-horizon benchmark results	I recently published a paper on long-horizon LLM memory and context management: https://arxiv.org/abs/2606.26105 The core idea is treating the context window as a working set instead of memory. Each step rebuilds a minimal, relevant context instead of carrying forward the full…	2026-06-27 09:58 GMT+8		/u/betanu701
2	Need suggestion on tiny llm? on training data and parameters	Hey guys, I want to make a tiny LLM only for grammar, reasoning, math, etc. (not big), and afterwards I want to train it on my project data. Here is the GitHub link -> nanoGPT_75M…	2026-06-27 12:22 GMT+8		/u/Tu_Chutiya_Hai_69	Community reaction (frontier/gpt-5.4-mini): The only substantive reply pushes the author away from training a 75M model from scratch and toward fine-tuning an existing small model instead, arguing that this should yield much better grammar, reasoning, and math performance with far less compute. There is no real disagreement in the thread; the main caveat is practical rather than ideological, namely that compute efficiency and starting from a stronger base model matter more than building from zero for this use case. Overall sentiment — post: skeptical; author: neutral. Reply threads: 2026-06-27 13:02 GMT+8: post=skeptical, author=neutral — They advise fine-tuning an existing small model instead of training a 75M model from scratch because it…
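
The "working set" idea in the ContextForge post (rebuild a small, relevant context at each step instead of carrying the full history) can be illustrated with a toy selector. This is not the paper's algorithm: the keyword-overlap scoring, the word budget, and every name below are assumptions made up for this sketch.

```python
def tokenize(text):
    """Lowercased set of whitespace-separated words (toy tokenizer)."""
    return set(text.lower().split())


def rebuild_context(task, store, budget_words=20):
    """Pick the most relevant stored notes for this step, within a word budget."""
    task_terms = tokenize(task)

    def overlap(note):
        return len(task_terms & tokenize(note))

    # Most relevant notes first; ties keep their original order.
    ranked = sorted(store, key=overlap, reverse=True)
    context, used = [], 0
    for note in ranked:
        cost = len(note.split())
        if overlap(note) == 0 or used + cost > budget_words:
            continue  # irrelevant, or would exceed the budget
        context.append(note)
        used += cost
    return context


# Hypothetical long-term store accumulated over earlier steps.
store = [
    "database schema uses postgres with a users table",
    "frontend is written in typescript with react",
    "users table has columns id email created_at",
    "deployment runs on a single docker host",
]

task = "add an index on the users table email column"
context = rebuild_context(task, store)
for note in context:
    print(note)
```

Only the two notes about the users table fit the budget here; the rest of the store stays outside the prompt until a later step needs it. A real system would presumably use embeddings or a learned ranker rather than raw word overlap.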

r/OpenWebUI

#	Post	Summary	Time	Author	Community reaction
1	Disabling automatic transcription of uploaded audio files	Hi, I noticed that when I upload an audio file, Open WebUI starts transcribing it and makes file upload take longer. This slows things down and I’d prefer to disable that behavior.	2026-06-26 19:38 GMT+8	/u/Lazy_Secretary_3091
2	Qwen3:14b vs Gemma4:26b for tool calling — what’s your experience?	I’ve been digging into which model handles tool calling better for local agentic workflows (n8n, automations, multi-step agents) and the data is surprisingly hard to pin down. The short version of what I found: - Qwen3:14b scores F1 0.971 on Docker’s practical tool calling benchmark (3,570 real cases) — nearly…	2026-06-27 01:34 GMT+8	/u/RegicideRook	Community reaction (frontier/gpt-5.4-mini): Commenters converge on the idea that both Qwen and Gemma are strong for tool calling, but they repeatedly caveat that enabling thinking can introduce failures or odd behavior, so disabling thinking is a practical operator choice in at least one workflow. One user says Qwen3.5:9b is better than Gemma 4 at equal sizes, while another recommends Qwen3.5 35b A3B FP8 with thinking disabled for tool-call work and keeps Gemma 4 as a backup because Qwen can produce odd output at times. The main disagreement is partly semantic and version-related—one reply asks whether the post means Qwen3 or Qwen3.5—rather than a direct contradiction about tool-calling quality. Overall sentiment — post: mixed; author: neutral. Reply threads: 2026-06-27 02:37 GMT+8: post=positive, author=neutral — The commenter asks whether the comparison is against Qwen3 or Qwen3.5 and says they use Qwen3.5:9b for… \| 2026-06-27 03:03 GMT+8: post=neutral, author=neutral — This reply only indicates interest in the Qwen3.5 clarification and says they are going to look into it. \| 2026-06-27 05:13 GMT+8: post=positive, author=neutral — The commenter notes that the comparison is between a dense model and an MoE model, says both are excellent at…
3	Are there no built in Adaptive Memory functions?	I read an article (https://open-webui.com/open-webui-adaptive-memory/) that made it sound like OpenWebUI has built in adaptive memory but I cannot seem to find it. Instead, I found a plugin called Adaptive Memory v4 (https://openwebui.com/posts/adaptive_memory_v40_3fa072e0) that looks like made by third parties.	2026-06-26 00:06 GMT+8	/u/BigGunE	Community reaction (frontier/gpt-5.4-mini): Commenters split between treating OpenWebUI memory as a configuration issue and treating it as a missing or insufficient feature: one user says they disabled the built-in memory and moved to Honcho after a lot of work, while another says the built-in tools should let the AI write memory if the system prompt is nudged correctly. The practical takeaway is that memory behavior seems model- and setup-dependent, since one user reports no automatic writes even with Kimi 2.5, Gemini 3.5 Flash, Deepseek v4 Pro, and 4o mini, and another notes that OpenWebUI functions/plugins are Python-based, visible at install time, and can call either local or user-specified LLMs, which makes third-party memory handling a fallback for people worried about sensitive state. Overall sentiment — post: mixed; author: neutral. Reply threads: 2026-06-26 00:44 GMT+8: post=concerned, author=neutral — They say the built-in memory was only manual for them, so they disabled it and deployed Honcho instead after… \| 2026-06-26 00:46 GMT+8: post=neutral, author=neutral — They ask what ‘manual’ means and argue that the built-in tools should let the AI write memory. \| 2026-06-26 01:55 GMT+8: post=concerned, author=neutral — They say the built-in memory never wrote anything for them and only remembered what they entered manually,…
4	Gemma 4 12B refuses to work since turning function calling on!	I recently got help from you guys and managed to turn the memory features and function calling on. Now the problem is that Qwen can call functions but Gemma no longer responds back.	2026-06-27 06:07 GMT+8	/u/BigGunE	Community reaction (frontier/gpt-5.4-mini): Commenters mostly treat this as a configuration/debugging problem rather than a Gemma model failure: they ask whether the user is running gemma-4-12B versus gemma-4-12B-it, whether the GPU and context window are large enough, and whether too many tools are enabled. One commenter says it works fine through the terminal and had been working before function calling was turned on, which points the likely failure mode at the tool/function-calling setup rather than the base model itself. Practical takeaway for operators is to isolate by disabling tools one by one and verify hardware capacity, context size, and the exact model variant before chasing deeper bugs. Overall sentiment — post: neutral; author: neutral. Reply threads: 2026-06-27 08:30 GMT+8: post=neutral, author=neutral — They suggest checking GPU capability and whether the context window is too large, framing the issue as one of… \| 2026-06-27 08:39 GMT+8: post=neutral, author=neutral — They say it works fine through the terminal and was working before function calling was enabled, implying the… \| 2026-06-27 08:42 GMT+8: post=neutral, author=neutral — They advise turning off all activated tools and re-enabling them one by one to identify which tool is causing…
5	Looking for feedback on a GPU recommendation tool for self hosted AI.	I have been working on a small project that helps choose compatible GPUs for different models and compares pricing across cloud GPU providers. The goal is to remove all of the trial and error before deployment.	2026-06-26 10:54 GMT+8	/u/Major_Border149
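
The advice in the Gemma 4 thread (turn every tool off, then re-enable them one by one) amounts to a simple isolation loop. The sketch below does not call any Open WebUI API: `model_responds` is a hypothetical callback standing in for "send a test prompt with these tools enabled and see whether the model answers", and the tool names are invented.

```python
def find_breaking_tools(all_tools, model_responds):
    """Re-enable tools one at a time and report the ones that stop the model from answering."""
    if not model_responds([]):
        raise RuntimeError("model fails even with no tools enabled; the cause is elsewhere")
    enabled, culprits = [], []
    for tool in all_tools:
        if model_responds(enabled + [tool]):
            enabled.append(tool)   # safe to keep on for the following checks
        else:
            culprits.append(tool)  # leave it off and keep testing the rest
    return culprits


# Stand-in for a real check (assumption): pretend one tool breaks responses.
def fake_model_responds(enabled_tools):
    return "web_search" not in enabled_tools


culprits = find_breaking_tools(["memory", "web_search", "calculator"], fake_model_responds)
print(culprits)
```

If the baseline check with no tools already fails, the loop stops early, which matches the other suggestions in the thread to verify the model variant, context size, and GPU capacity first.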

r/selfhosted

#	Post	Summary	Time	Score	Author	Community reaction
1	Jcorp Nomad: A Self Hosted media server that fits in your pocket!	[Image: Jcorp Nomad: A Self Hosted media server that fits in your pocket!] Howdy folks! It’s been over a year since my first nomad post, I’ve been daily driving the thing for a few months now and it’s held up.	2026-06-27 05:22 GMT+8		/u/JcorpTech	Community reaction (frontier/gpt-5.4-mini): Commenters generally treat the pocket media server as a cool, genuinely useful self-hosted project and focus on practical limits rather than aesthetics: the main operator questions were concurrent streams, file format support, and how much the ESP32/storage stack can really handle. The concrete caveats were that throughput varies heavily with environmental factors, SD card brand, and buffer behavior; the author says about 8 simultaneous streams was the most they managed before buffering became unmanageable, video is best kept around 480p even though 1080p can work, and files over 4GB become a problem enough that exFAT support is desirable but awkward with current library/IDE constraints. Overall sentiment — post: positive; author: positive. Reply threads: 2026-06-27 05:37 GMT+8: post=positive, author=positive — They call the project super cool and ask a practical capacity question about how well it handles multiple… \| 2026-06-27 05:43 GMT+8: post=neutral, author=neutral — They explain that music is very low cost, video is best kept as low as 480p though 1080p technically works,… \| 2026-06-27 06:14 GMT+8: post=positive, author=neutral — They suggest adding exFAT support because it is readable on Linux, macOS, and Windows, supports very large…
2	What self-hosted apps do you actually use every day?	Hey there, Just curious, what self-hosted apps are you actually using daily? I keep seeing people recommend new stuff here, and that’s actually how I found Glance dashboard.	2026-06-27 00:09 GMT+8		/u/No-Card-2312	Community reaction (frontier/gpt-5.4-mini): Commenters mostly used the thread to name their daily self-hosted stack — Jellyfin, WireGuard, Navidrome, Ampcast, Miniflux, Siyuan, and Yamtrack — and the only app that got deeper discussion was Siyuan, which one user described as basically “Obsidian but no workarounds” with keyboard shortcuts, slash commands like /link, and a simple host-and-web-UI setup. The main caveat was mobile access: one user could not get the iOS app to connect to a self-hosted instance, and the workaround offered was using the PWA from the phone’s home screen, which supports photo attachments but is still “not as polished” and has somewhat icky onboarding. Overall sentiment — post: positive; author: neutral. Reply threads: 2026-06-27 00:13 GMT+8: post=positive, author=neutral — They shared a concrete daily self-hosted stack: Jellyfin, WireGuard, Navidrome, Ampcast, Miniflux, Siyuan,… \| 2026-06-27 03:43 GMT+8: post=positive, author=neutral — They said Siyuan looked really nice and asked whether the other user actually liked it enough to use daily. \| 2026-06-27 03:48 GMT+8: post=positive, author=neutral — They said Siyuan works well for basic notes, supports keyboard shortcuts and slash commands like /link, runs…

r/ClaudeAI

#	Post	Summary	Time	Score	Author	Community reaction
1	I used Claude to fix my biggest frustration with PDFs	[Image: I used Claude to fix my biggest frustration with PDFs] My bank asked for 17 PDF documents regarding my mortgage application. I got tired of opening and closing files and keeping track of things.	2026-06-27 05:05 GMT+8		/u/gounisalex	Community reaction (frontier/gpt-5.4-mini): Most commenters liked the PDF workflow idea and treated it as something that should be standard, with repeated asks for editing, signing, annotations, and a Windows build; OP said those features are being worked on and that the app is Electron-based and intended to be cross-platform. The main caveat came from Mac users noting that Preview already handles opening multiple PDFs, combining them, annotating them, and stamping signatures for free, though certificate-based digital signatures and the app’s free-floating 2D canvas for side-by-side comparison were still seen as differentiators. One commenter also worried that Electron could mean Chrome-like RAM usage, so the practical takeaway is that enthusiasm is high but people are watching feature completeness and resource footprint. Overall sentiment — post: positive; author: positive. Reply threads: 2026-06-27 05:41 GMT+8: post=positive, author=positive — They said the idea is so obvious it should be standard, asked about editing, signing, and markup features,… \| 2026-06-27 06:09 GMT+8: post=mixed, author=neutral — They pointed out that macOS Preview already lets them open multiple PDFs, combine them, annotate them, and… \| 2026-06-27 12:26 GMT+8: post=skeptical, author=neutral — They reacted to the Electron stack choice by worrying it would use Chrome-like amounts of RAM.
2	Trump admin allows Anthropic to release Mythos AI model to some companies, government agencies: Reports	Source: https://www.cnbc.com/2026/06/26/us-government-anthropic-claude-mythos5-ai.html?__source=iosappshare%7Ccom.atebits.Tweetie2.ShareExtension…	2026-06-27 07:01 GMT+8		/u/thelastsubject123	Community reaction (frontier/gpt-5.4-mini): The thread’s main consensus is that “trusted partners” is undefined enough to look arbitrary, with multiple commenters interpreting access to Mythos as something large companies get through donations, lobbying, or plain favoritism rather than a transparent eligibility process. The few concrete operator takeaways in the comments are that access may be limited to incumbents like Cisco, Palo Alto, Fortinet, and Microsoft, and one commenter claims the Project Glasswing companies already had Mythos for months, which would make this less of a new release than a staged rollout; there is little disagreement on the corruption suspicion, but one question remains whether this is a temporary preview or the new normal for frontier-model distribution. Overall sentiment — post: critical; author: neutral. Reply threads: 2026-06-27 07:16 GMT+8: post=neutral, author=neutral — This commenter asks for specifics on what “trusted partners” means, who qualifies, what Mythos can be used… \| 2026-06-27 07:47 GMT+8: post=critical, author=neutral — This commenter says the arrangement is likely another revenue stream for Trump and his associates and frames… \| 2026-06-27 07:29 GMT+8: post=neutral, author=neutral — This commenter speculates that the approved companies are probably large vendors such as Cisco, Palo Alto,…

r/ClaudeCode

#	Post	Summary	Time	Score	Author	Community reaction
1	Sonnet 4.6 is incredibly fast and now more realiabel!	I was really frustrated with the slowness and debugging looping of Opus 4.8. Recently, I decided to try out Sonnet 4.6, and I was amazed by its performance.	2026-06-27 05:45 GMT+8		/u/Wonderful-Ad-5952	Community reaction (frontier/gpt-5.4-mini): Commenters mostly converged on a practical Sonnet-main plus Opus-as-advisor workflow: one user says `/advisor` can be set once to call Opus when Sonnet 4.6 gets stuck or at the end of a longer thought cycle, and even quotes Anthropic’s recommendation that this can yield near-Opus performance with reduced token usage. The main caveats are operational rather than ideological: people want clearer instructions and links, one user says switching to a new session loses ongoing context, and another notes the web version of Claude Code does not report slash commands correctly, pushing the discussion toward CLI usage and handoff markdowns. Overall sentiment — post: mixed; author: positive. Reply threads: 2026-06-27 05:47 GMT+8: post=positive, author=positive — The commenter suggests keeping Opus 4.8 available as an advisor, which supports the post’s premise that… \| 2026-06-27 05:49 GMT+8: post=positive, author=positive — The commenter explains that `/advisor` is configured once and invoked when Sonnet 4.6 gets stuck or finishes… \| 2026-06-27 05:52 GMT+8: post=neutral, author=positive — The commenter asks for more instructions or a link because the `/advisor` setup is not clear, showing…
2	Terminal or IDE for Claude Code — which do you actually use, and why?	genuinely trying to settle this for myself. i run claude code in an IDE and keep seeing people swear by the raw terminal, not just put up with it but actually prefer it.	2026-06-27 11:16 GMT+8		/u/Lonely_Ostrich9801	Community reaction (frontier/gpt-5.4-mini): Most commenters say they use Claude Code in the terminal/CLI while keeping an IDE open for reviewing edits or running code, and one says IDE use is shrinking further because they mostly inspect commits or PRs on GitHub. The strongest pro-terminal take argues that VSCode’s terminal is the “best of both worlds” because the IDE plugin panel feels bloated, the CLI handles multi-repo orchestration, session resume, /remote-control, and automation, but one commenter asks how non-interactive runs avoid API fees and another replies that their Max plan covers everything. Overall sentiment — post: positive; author: positive. Reply threads: 2026-06-27 11:55 GMT+8: post=positive, author=positive — They strongly prefer Claude Code in the VSCode terminal, saying it avoids a bloated IDE plugin panel, works… \| 2026-06-27 12:29 GMT+8: post=concerned, author=neutral — They ask how the non-interactive usage avoids API fees, saying they thought the loopholes were closed and… \| 2026-06-27 12:31 GMT+8: post=positive, author=positive — They say their Max plan runs everything, including client work, code, email, and scheduled tasks, and never…

r/Codex

#	Post	Summary	Time	Score	Author	Community reaction
1	GPT 5.6 “sol” announced	it’s apparently better than Mythos 5 by 10% https://openai.com/index/previewing-gpt-5-6-sol/…	2026-06-27 01:07 GMT+8		/u/Prestigious-Kick7291	Community reaction (frontier/gpt-5.4-mini): Commenters mostly focused on access and rollout risk rather than the claimed model improvement: several expect a limited preview, ID gating, or other restrictions before broad availability, while one commenter thinks everyone will get it in a few weeks and another says outside competition could force OpenAI to relax restrictions. There is also skepticism about quality retention, with commenters predicting the model will be “dummer” after launch or that the first days may be the only “pristine” version before inference is sacrificed for training. Practical takeaway for operators: expect staged access, possible policy friction, and uncertain post-launch fidelity even if the announcement sounds strong on paper. Overall sentiment — post: concerned; author: neutral. Reply threads: 2026-06-27 01:09 GMT+8: post=concerned, author=neutral — They say they are no longer excited because they expect either restrictions or an ID process to prevent them… \| 2026-06-27 01:41 GMT+8: post=concerned, author=neutral — They quote the preview language about a limited rollout to trusted partners shared with the government,… \| 2026-06-27 04:02 GMT+8: post=neutral, author=neutral — They believe the model will reach everyone in a few weeks, framing the delay as annoying but not as a…
2	GPT-5.6 Officially Previewed: Beats Mythos 5.	[Image: GPT-5.6 Officially Previewed: Beats Mythos 5.] Source: https://x.com/pankajkumar_dev/status/2070563550301769753 GPT-5.6 Pricing: - Sol: $5 / $30 -…	2026-06-27 02:05 GMT+8		/u/Much_Ask3471	Community reaction (frontier/gpt-5.4-mini): Commenters mostly fixated on two practical issues: the rumored pricing/throughput for Luna looks compelling enough that one person said they’d just use it if it really hits 650tk/s, but others immediately cautioned that price means little without quality data, comparing it to Gemini Flash Lite being cheaper yet “pretty useless.” A separate thread was frustrated by model churn in Codex, with users lamenting that 5.3 Codex was removed, hoping mini or even nano would be available because 5.4 mini “isn’t good enough,” and speculating that 5.4 may also disappear as resources tighten. The only clear attack on the post itself was skepticism that the author is “scrap[ing] together speculation” and self-linking, so the practical operator takeaway is to treat the preview as tentative until strength, availability, and retention of older models are confirmed. Overall sentiment — post: mixed; author: skeptical. Reply threads: 2026-06-27 03:19 GMT+8: post=critical, author=critical — They accuse the post of being built from speculation and self-referential tweets rather than solid reporting. \| 2026-06-27 02:07 GMT+8: post=positive, author=neutral — They hope mini or even nano will be available on Codex because limits are dropping and 5.4 mini is not good… \| 2026-06-27 02:35 GMT+8: post=positive, author=neutral — They think the biggest story is the rumored 650tk/s throughput and say they would just use the model if that…

Generated 2026-06-27 13:20 GMT+8 | Next update in 2 hours