Technology Watch - AI and Cybersecurity Insights

Summary

Audio Summary

On safety issues, Google DeepMind is allocating 10 million USD to a funding program for research into safety for multi-agent systems. Their concern is that multi-agent systems can exhibit a collective behavior that is hard to predict by observing individual agents in isolation. For some researchers, if artificial general intelligence is possible at all, it might not come from a powerful single model but from an “agent hive mind” where the collective participation of agents creates the overall intelligence. Meanwhile, Devin Kim, now president of the non-profit Center for AI Safety, has filed a lawsuit against xAI claiming that he was fired from that company for raising concerns about AI risks. xAI has faced several lawsuits and investigations this year as its Grok AI chatbot was used to generate millions of AI-enhanced sexualized images. The Center for Countering Digital Hate says many of the images were created from photos of women without their consent.

The long-awaited SpaceX IPO has taken place and propelled the company’s valuation to over 2.7 trillion USD, making it the fifth-most valuable company in the world. SpaceX acquired xAI in February of this year. Elon Musk holds most of the SpaceX shares and retains majority voting power. The IPO also makes him the world’s first trillionaire. Meanwhile, Prometheus, a physical AI startup co-founded by Jeff Bezos, has just raised 12 billion USD in a second funding round with a valuation of 41 billion USD. Physical AI refers to artificial intelligence that can perceive, reason about, and act in the physical world, and covers robots, autonomous vehicles, drones, and Internet of Things systems. Prometheus wants to build an “artificial general engineer”, which is a physical AI system capable of manufacturing physical goods on assembly lines. For Bezos, this objective is required to address a “labor scarcity” since demand for workers will soon exceed human worker availability.

Subquadratic, a Miami-based AI startup, recently announced a breakthrough that hugely reduces the costs of running large language models, though some scientists are skeptical. Subquadratic has created a language model called SubQ which the company claims is able to process 12 times as much text as other contemporary models. The SubQ model has not been released yet, except to a select few. One issue raised is that the model reused weights from the Chinese open-source model Qwen instead of being trained from scratch. This could impact the significance of the claims made by Subquadratic. Elsewhere, an InfoWorld article presents a detailed list of key metrics for classifying large language models. One noteworthy aspect is the proliferation of agentic AI related metrics such as token efficiency, tool call accuracy and instruction following.

On society issues, Amazon announced this week that it is now “water-positive” in India. This means that it returns more water to communities than it uses for operations, including data centers, offices and warehouses. Water availability is a particular challenge in India. The country has 18% of the world’s population but only 4% of the world’s freshwater resources. Meanwhile, the Guardian reported that data collected from the Pokémon Go app could soon be used to help drones find targets in conflicts. The company that created Pokémon Go used the collected data to train AI models to recognize objects in the physical world. The company is now working on a deal worth 217 million USD with the US Army for training software. Drones typically use GPS to orientate themselves. However, when GPS signals are jammed, spoofed or interfered with, drones can get lost. The military is looking for supplementary ways for drones to recognize their locations – and Pokémon Go game data is an option.

On geopolitical issues, the US administration has ordered Anthropic to stop exporting its powerful Fable and Mythos models, or releasing them to non-US nationals. There is ongoing debate within the US administration about this type of ban. It is possible that China will catch up on US AI capabilities with or without the ban, so removing the ban would at least help US companies remain competitive abroad. Elsewhere, Microsoft is now the principal supplier of OpenAI models in China and the Chinese company ByteDance is expected to spend one billion USD on Microsoft AI this year. Microsoft is allowed to sell GPT models abroad under the terms of an agreement it made with OpenAI. Both OpenAI and Anthropic are refusing to sell AI models on the Chinese market. Several US observers have expressed concern that Chinese companies are distilling US models to develop their own – this is where a (student) model is created by learning directly from a teacher model, instead of being created from training data.

1. Google DeepMind is worried about what happens when millions of agents start to interact

2. Jeff Bezos’s Prometheus raises $12B to build an ‘artificial general engineer’ for the physical world

3. Musk’s xAI fired engineer for raising concerns about Grok chatbot, lawsuit claims

4. Pokémon Go data trained AI that could assist military drones in war zones

5. SpaceX is public: Everything you need to know post-IPO

6. 33 LLM metrics to watch closely

7. A startup claims it broke through a bottleneck that’s holding back LLMs

8. Microsoft sells OpenAI models in China. OpenAI and Anthropic won’t.

9. Amazon points to water conservation steps in India amid data centre scrutiny

10. From PGP to Mythos: a brief history of export controls that didn’t stop anyone

1. Google DeepMind is worried about what happens when millions of agents start to interact

Google DeepMind is allocating 10 million USD to a funding program for research into safety for multi-agent systems.

The concern is that multi-agent systems can exhibit a collective behavior that is hard to predict by observing individual agents in isolation. As Google’s AGI safety and alignment head puts it: “We see this with humanity, too… Our institutions can accomplish things that no individual human can.”
The main risks envisaged are more powerful versions of the bad behavior already present on the Internet like scams and other cyberattacks.
For some researchers at Google DeepMind, if artificial general intelligence is possible at all, it might not come from a powerful single model but from an “agent hive mind” where the collective participation of agents creates the overall intelligence.
The challenge for agent cybersecurity is that, whereas software was previously created by writing specific instructions, agents are just given goals and decide for themselves what to do to achieve those goals. Previously, it was possible to know what software was doing, but “An agent breaks all of those assumptions. It reasons, it improvises, and it can be hijacked by a single sentence buried in a document it was asked to read.”

Source MIT Technology Review

2. Jeff Bezos’s Prometheus raises $12B to build an ‘artificial general engineer’ for the physical world

Prometheus, a physical AI startup co-founded by Jeff Bezos, has just raised 12 billion USD in a second funding round with a valuation of 41 billion USD.

Physical AI refers to artificial intelligence that can perceive, reason about, and act in the physical world, and covers robots, autonomous vehicles, drones, and Internet of Things systems.
The goal of Prometheus is to build what it calls an “artificial general engineer”, which is a physical AI system capable of manufacturing physical goods on assembly lines. For Bezos, this objective is required to address a “labor scarcity” since demand for workers will soon exceed human worker availability.
While many fear that physical AI will lead to increased unemployment, Bezos says that “significant productivity in the economy is going to raise the standard of living”.
Physical AI is becoming increasingly attractive to investors because, compared to software AI, physical machines are natural moats for companies to create competitive advantage.

Source TechCrunch

3. Musk’s xAI fired engineer for raising concerns about Grok chatbot, lawsuit claims

Devin Kim, now president of the non-profit Center for AI Safety, has filed a lawsuit against xAI claiming that he was fired from that company for raising concerns about AI risks.

The lawsuit states: “Mr Kim repeatedly complained that xAI’s failure to prioritize AI safety, particularly with respect to Grok, virtually guaranteed that the Company would commit unlawful acts, from fomenting discrimination to proliferating weapons of mass destruction.”
xAI has faced several lawsuits and investigations this year as its Grok AI chatbot was used to generate millions of AI-enhanced sexualized images. The Center for Countering Digital Hate says many of the images were created from photos of women without their consent.
The center also says that 23000 sexualized images of children were created during an 11-day period between December and January this year.

Source The Guardian

4. Pokémon Go data trained AI that could assist military drones in war zones

This Guardian article describes how data collected from the Pokémon Go app will be used to potentially help drones find targets in conflicts.

Pokémon Go is an augmented reality mobile phone game from around 2016 where users find virtual creatures with the help of their cameras. The app had 800 million downloads in 2018.
An update to the game in 2021 gave users game points for scanning real locations using their device cameras and uploading the recording.
Niantic, the company that created Pokémon Go, used the collected data to train AI models to recognize objects in the physical world. Niantic is now partnering with Vantor – a company specializing in spatial detection software for drones. Vantor announced a deal worth 217 million USD with the US Army for training software.
Drones typically use GPS to orientate themselves. However, when GPS signals are jammed, spoofed or interfered with, drones can get lost. The military is looking for supplementary ways for drones to recognize their locations.

Source The Guardian

5. SpaceX is public: Everything you need to know post-IPO

The long-awaited SpaceX IPO has taken place and propelled the company’s valuation to over 2.7 trillion USD, making it the fifth-most valuable company in the world.

SpaceX’s value proposition is reusable rocket launches and the Starlink satellite network. SpaceX acquired xAI in February of this year.
The company priced its 555.6 million shares at 135 USD each to raise 75 billion USD. The share price soared to over 200 USD and is around 185 USD on the date of this publication.
The New York Times estimates that as many as 4400 employees could become millionaires thanks to the IPO.
SpaceX now intends to acquire Cursor AI for 60 billion USD in stock. The SpaceX COO has hinted at a merger of SpaceX and Tesla.
Elon Musk holds most of the SpaceX shares and retains majority voting power. The IPO also makes him the world’s first trillionaire.

Source TechCrunch

6. 33 LLM metrics to watch closely

This InfoWorld article presents a detailed list of the key metrics currently used to classify large language models. One noteworthy aspect is the proliferation of agentic AI related metrics.

One class of metrics relates to speed. These include: Time to first token (the time needed to produce the first token, which can be critical in real-time applications), Time per output token (the average time the model takes for each output token), Tokens per second, Throughput, Token efficiency (how much work is done to produce the final result in agent applications) and Tail latency (uses queueing theory to estimate the response time of the slowest requests, typically the time within which 99% of requests complete).
Another class relates to safety. These include Hallucination Rate, Toxicity and bias scores, PII leakage, Jailbreak resistance, Prompt injection vulnerability, and Copyright infringement scores.
The accuracy class includes the metrics for Tool calling accuracy (how often the model chooses the best external tool in agent scenarios), Prompt sensitivity, Grounding score (how well the model uses the specific documents it is given, as in RAG), Model variability (how outputs differ between model runs for the same prompt), Format compliance rate (adherence to standard document types like CSV and JSON), and Instruction following (how well specific instructions in prompts are adhered to).
The article also mentions the current standard LLM benchmarks. These include GSM8K (Grade School Math 8K benchmark with 8500 problems from grade school math classes), GPQA (Graduate-Level Google-Proof Q&A, composed of hundreds of hard questions from graduate school), MMLU-Pro (Massive Multitask Language Understanding dataset to test a model’s understanding of a broad set of scientific knowledge, with 12000 questions on biology, chemistry, economics, and law), MBPP (Google’s Mostly Basic Python Problems), and SWE-bench (several thousand software engineering challenges to see how well a model solves programming problems).
Finally, the LMSYS Chatbot Arena Organization’s Chatbot Arena is a system that gives the same prompt to different models and then asks humans to pick the best results.
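The speed metrics listed above can be illustrated with a minimal Python sketch. It assumes that a timestamp is recorded for every streamed output token and takes tail latency to be the 99th-percentile latency computed with the nearest-rank method; the function names `speed_metrics` and `percentile` are hypothetical and do not come from the InfoWorld article.

```python
import math


def speed_metrics(request_start, token_times):
    """Compute speed metrics for one streamed response.

    request_start: time (seconds) at which the request was sent.
    token_times: arrival time (seconds) of each output token, in order.
    """
    if not token_times:
        raise ValueError("need at least one output token")
    n = len(token_times)
    # Time to first token: delay before anything is shown to the user.
    ttft = token_times[0] - request_start
    # Time per output token: average gap between tokens after the first one.
    tpot = (token_times[-1] - token_times[0]) / (n - 1) if n > 1 else 0.0
    # Tokens per second over the whole request, including the initial wait.
    total = token_times[-1] - request_start
    tokens_per_second = n / total if total > 0 else float("inf")
    return {"ttft": ttft, "tpot": tpot, "tokens_per_second": tokens_per_second}


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    print(speed_metrics(0.0, [0.5, 1.0, 1.5, 2.0]))
    # Assumed sample of end-to-end latencies (seconds) for 100 requests.
    latencies = [i / 10 for i in range(1, 101)]
    print("p99 latency:", percentile(latencies, 99))
```

Reporting a percentile rather than the mean is a deliberate choice for tail latency: a handful of very slow requests barely moves the average but dominates what the unluckiest users experience.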

Source InfoWorld

7. A startup claims it broke through a bottleneck that’s holding back LLMs

Subquadratic, a Miami-based AI startup, recently announced a breakthrough that hugely reduces the costs of running large language models, but some scientists are skeptical.

A typical language model uses a type of neural network called a transformer. A GPT model could have hundreds of transformers linked together. When processing a text, a transformer associates each word with a number. Processing the text during inference involves multiplying each number by every other one. Thus, a text with 10’000 words requires roughly 50 million individual multiplications. Further, increasing the number of words leads to a quadratic increase in multiplications: doubling the length of the text roughly quadruples the work. The underlying process is known as dense attention, and is the main reason why language models are so expensive.
Subquadratic claims to have discovered a solution based on sparse attention. This approach does not require all numbers to be multiplied with all others; only a selected subset of numbers needs to be multiplied. Informally, this corresponds to the fact that not all words in a text are related to each other.
Subquadratic has created a language model called SubQ which the company claims is able to process 12 times as much text as other contemporary models. The model scored 89.7% on the LiveCodeBench coding benchmark. On the RULER 129 test which measures a model’s ability to locate information from a large data set, the model run cost only 8 USD, compared to 2600 USD for Anthropic’s Opus 4.6.
The model’s context window (working memory) is 12 million tokens – compared to 1 million for contemporary models.
The SubQ model has not been released yet, except to a select few. One issue raised is that the model reused weights from the Chinese open-source model Qwen instead of being trained from scratch. This could impact the significance of the claims made by Subquadratic.
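The arithmetic behind dense and sparse attention described above can be checked with a short Python sketch. It follows the simplified picture used in this item (one multiplication per pair of words). The sliding-window pattern in `windowed_pairs` is only an assumed example of sparse attention chosen for illustration; it is not Subquadratic's method, which has not been published, and both function names are hypothetical.

```python
def dense_pairs(n_words):
    """Number of word pairs when every word is combined with every other word."""
    return n_words * (n_words - 1) // 2


def windowed_pairs(n_words, window):
    """Number of pairs when each word is combined only with up to `window`
    words immediately before it (an assumed, illustrative sparse pattern)."""
    return sum(min(i, window) for i in range(n_words))


if __name__ == "__main__":
    n = 10_000
    print("dense pairs:", dense_pairs(n))            # about 50 million
    print("doubling factor:", dense_pairs(2 * n) / dense_pairs(n))
    print("windowed pairs:", windowed_pairs(n, 128))
```

With a fixed window, the number of pairs grows in proportion to the length of the text instead of with its square, which is the kind of saving that sparse-attention approaches aim for.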

Source MIT Technology Review

8. Microsoft sells OpenAI models in China. OpenAI and Anthropic won’t.

Microsoft is now the principal supplier of OpenAI models in China. The Chinese company ByteDance is expected to spend one billion USD on Microsoft AI this year.

Microsoft is allowed to sell GPT models abroad under the terms of an agreement it made with OpenAI.
Microsoft is also hosting Chinese models like DeepSeek-V4. The company is managing to sell Chinese models in the US and US models in China. President Brad Smith has said that the Chinese market accounted for 1.5% of Microsoft’s revenue in 2024.
Both OpenAI and Anthropic are refusing to sell AI models on the Chinese market.
Several US observers have expressed concern that Chinese companies are distilling US models to develop their own – this is where a (student) model is created by learning directly from a teacher model, instead of being created from training data.
Microsoft claims to limit its exposure to the Chinese market by not hosting any models on Chinese soil. The models run in data centers outside China, such as in Singapore.
Politicians in Washington are increasingly hostile to US companies selling AI models in China. It is not improbable that Microsoft will be asked to stop making the OpenAI models available there.
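The distillation mentioned above, where a student model learns directly from a teacher model, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: it uses the common soft-label formulation, in which the student is penalized by the Kullback-Leibler divergence between the teacher's and the student's temperature-softened output distributions. The function names and the temperature value are hypothetical, and the source does not say which technique any particular company uses.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The loss is zero when the student reproduces the teacher's distribution
    and grows as the two distributions diverge.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]  # assumed teacher scores for three candidate tokens
    print(distillation_loss(teacher, [1.0, 2.0, 3.0]))  # student agrees with teacher
    print(distillation_loss(teacher, [3.0, 2.0, 1.0]))  # student disagrees
```

In training, the student's parameters would be adjusted to reduce this loss over many prompts, so the student needs only the teacher's outputs rather than the teacher's original training data.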

Source Artificial Intelligence News

9. Amazon points to water conservation steps in India amid data centre scrutiny

Amazon announced this week that it is now “water-positive” in India. This means that it returns more water to communities than it used for operations, including data centers, offices and warehouses.

The company says that it has achieved this landmark through reduced water use in its facilities, efficient irrigation and watershed restoration (projects that repair and maintain land areas so that all water drains to a common body of water like a lake or river).
The company has set an objective of 2030 to be water-positive globally.
Water availability is a particular challenge in India. The country has 18% of the world’s population but only 4% of the world’s freshwater resources. Further, a strong El Niño led to weak monsoon rains this year, so water shortages are expected. In Karnataka, which has a population of 13 million people and is home to many tech companies, authorities believe that the state has only 40 days of water left.

Source Reuters

10. From PGP to Mythos: a brief history of export controls that didn’t stop anyone

This TechCrunch article analyzes a ban just imposed on Anthropic by the US administration which forbids the company exporting its powerful Fable and Mythos models, or releasing them to non-US nationals.

Currently, only companies vetted by Anthropic have had access to the models. However, Anthropic gave access to a South Korean telecom company which the White House believes may have links to China. This was one reason for the ban being announced.
The article compares the ban to that imposed on encryption products in the 1990s. At that time, the US government wanted to be able to eavesdrop on Internet communications so forbade the export of encryption technologies. The popular Pretty Good Privacy (PGP) encryption software was the target of a ban and the US Customs Service opened a criminal investigation against PGP creator Phil Zimmermann for violating arms exports controls.
Zimmermann published the source code of PGP in a printed book, rendering the ban ineffective. Today, messaging apps like Signal and WhatsApp use end-to-end encryption.
Another case of software control relates to spyware. In the early 2010s, it was discovered that Western companies were developing spyware that was used against dissidents in the Middle East. This led to several governments signing the Wassenaar Arrangement to control the export of such software. However, many EU countries have been lenient in its implementation, and spyware exports continue.
There is a key debate within the US administration about the ban. It is possible that China will catch up on US AI capabilities with or without the ban, so removing the ban would at least help US companies remain competitive abroad.

Source TechCrunch
