Anthropic dares you to jailbreak its new AI model

Week-long public test follows 3,000+ hours of unsuccessful bug bounty claim attempts.

Even the most permissive corporate AI models have sensitive topics that their creators would prefer they not discuss (e.g., weapons of mass destruction, illegal activities, or, uh, Chinese political history). Over the years, enterprising AI users have resorted to everything from weird text strings to ASCII art to stories about dead grandmas in order to jailbreak those models into giving the "forbidden" results.

Today, Claude model maker Anthropic has released a new system of Constitutional Classifiers that it says can "filter the overwhelming majority" of those kinds of jailbreaks. And now that the system has held up to over 3,000 hours of bug bounty attacks, Anthropic is inviting the wider public to test the system and see whether they can fool it into breaking its own rules.

Respect the constitution

In a new paper and accompanying blog post, Anthropic says its new Constitutional Classifier system is spun off from the similar Constitutional AI system that was used to build its Claude model. The system relies at its core on a "constitution" of natural language rules defining broad categories of permitted (e.g., listing common medications) and disallowed (e.g., acquiring restricted chemicals) content for the model.


“Zero warnings”: Longtime YouTuber rails against unexplained channel removal

Developer calls for human review to end YouTube’s automated channel removals.

Artemiy Pavlov, the founder of a small but mighty music software brand called Sinesvibes, spent more than 15 years building a YouTube channel with all original content to promote his business' products. Over all those years, he never had any issues with YouTube's automated content removal system—until Monday, when YouTube, without issuing a single warning, abruptly deleted his entire channel.

"What a 'nice' way to start a week!" Pavlov posted on Bluesky. "Our channel on YouTube has been deleted due to 'spam and deceptive policies.' Which is the biggest WTF moment in our brand's history on social platforms. We have only posted demos of our own original products, never anything else...."

Officially, YouTube told Pavlov that his channel violated YouTube's "spam, deceptive practices, and scam policy," but Pavlov could think of no videos that might be labeled as violative.


Publishers Ramp Up Pressure vs. Anna’s Archive, Sci-Hub, Z-Library & Libgen

The world’s major publishers claim that unlicensed libraries cast a permanent shadow over authors’ ability to make a living from their work. In common with their movie and music industry counterparts, site-blocking is one of the weapons of choice, albeit against well-prepared opponents.

From: TF, for the latest news on copyright battles, piracy and more.

In a world where many things seem vulnerable to change at a moment’s notice, the same world viewed from a more distant vantage point hardly seems to change at all.

Whether for recreation or education, demand for published content in various formats continues to thrive. Yet a closer view reveals bricks and mortar book stores and traditional libraries in decline, and licensed digital libraries invisibly replacing both online. Not from a position of safety, however.

The world’s major publishers claim that unlicensed libraries cast a permanent shadow over authors’ ability to make a living from their work. Those same shadows also make it more difficult to predict whether today’s investments in publishing content will pay off, or find themselves copied at will and distributed for free on the world’s most popular shadow libraries.

Difficult or Impossible to Stop

Stopping these sites has proven impossible, at least to date. Relative newcomer Anna’s Archive faces the usual pressures, but thus far hasn’t been tested under the existential crisis conditions previously weathered by its infamous counterparts.

Z-Library, Sci-Hub, and Libgen have consistently emerged relatively unscathed from lawsuits and numerous enforcement measures, despite what should’ve been insurmountable odds.

Libgen looked most precarious recently; its eventual demise may not have been swift, but with no new content aboard a captainless ship adrift, the risk of being invisibly replaced itself seemed increasingly likely. At least until unexpected repairs saved the day.

Site Blocking From the Shadows

In common with the movie and music industries, trade groups in the publishing sector view site-blocking as a useful tool in the broader fight against piracy. In the UK, the Publishers Association represents the interests of publishing companies both large and small. It also supplies a range of anti-piracy services, from conducting research and sharing insights, to the removal of content from various online services.

A significant component of the association’s anti-piracy work receives no mention on its official Content Protection and Enforcement page. Yet behind the scenes, the Publishers Association uses authority obtained at the High Court to compel the UK’s largest ISPs to block access to the shadow libraries mentioned above.

Publishers Elsevier and Springer Nature also engage in site-blocking in the UK. Since last November, all publishers appear to have stepped up their blocking efforts, at least in part due to a series of blocking circumvention measures deployed by Sci-Hub and Libgen and, by volume, especially those attributable to Anna’s Archive.

Impossible to Shut Down, But Perhaps More Difficult to Find

In mid-November, Elsevier and Springer Nature identified several domains that facilitate access to Sci-Hub. Among them is pismin.com, which immediately redirects to a more recognizable domain, sci-hub.se. At the top of that page, visitors are advised of other domains to use (sci-hub.st and sci-hub.ru) in the event that sci-hub.se becomes inaccessible.

Elsevier & Springer Nature already have those domains covered. The list from November covers pismin.com, plus domains and subdomains including ac.cn.sci-hub.ru / ac.ru.sci-hub.ru, pubs.deutsche.orgs.sci-hub.se, and the initially confusing sci-hub.st.sci-hub.se.sci-hub.st. An update in December added more of the same, including free.read.sci-hub.se.sci-hub.st and pubs.francais1.orgs.sci-hub.se; both were likely crafted as blocking countermeasures but are now blocked themselves, just like the others.

With responsibility for blocking Anna’s Archive, Z-Library and Libgen, the Publishers Association had a significantly busier period of blocking during November, December, and January. The deployment of dozens of country-specific subdomains under annas-archive.org appears to have been swiftly handled by the Publishers Association, as shown in the small sample below.

[Image: sample of blocked annas-archive.org subdomains]

A new wave of Z-Library domains/subdomains, including z-library.sk, it.1lib.sk, es.1lib.sk, 1lib.sk, z-lib.gs, z-lib.fm, z-lib.gl, it.z-lib.gd, en.z-lib.gs, and id.z-lib.gs represents just a small sample from an unusually large list.

Easily recognizable Libgen-related domains are numerous too, mostly falling into two categories. The first group consists of straightforward main domains, including libgenesis.net, libgen.mx and library.bz. The second are instantly identifiable as proxy service domains, such as libgen.unblockninja.com, libgen.proxyninja.org, libgen.dirproxy.info, and libgen.pproxy.org.

In most cases URLs with this appearance facilitate access to Libgen, but are commonly operated by third parties as part of a general unblocking service.

While yet to be confirmed, there are signs that the subdomain whac-a-mole may not continue forever. There are suggestions that new blocklist entries may be wildcard-enabled. If so, that would eliminate subdomain countermeasures, while introducing a new requirement for additional domain purchases, potentially in very large numbers.

For some sites, that might amount to an irritant. For those yet to automate such tasks while also on the advertising revenue brink, it might even prove terminal.

Not for the sites mentioned here necessarily (despite mounting bills), but others perhaps, and there’s no shortage of supply.
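To illustrate why wildcard-enabled entries would end the subdomain game described above, here is a minimal sketch in Python. It is not how any ISP actually implements blocking; the function name `is_blocked` and the `wildcard` flag are hypothetical, and the blocklist holds a single domain taken from the examples in this article.

```python
def is_blocked(hostname: str, blocklist: set[str], wildcard: bool) -> bool:
    """Return True if hostname matches a blocklist entry.

    Hypothetical illustration only. With wildcard=False, only exact
    matches count. With wildcard=True, any subdomain of a listed
    domain matches as well.
    """
    host = hostname.lower().rstrip(".")
    if host in blocklist:
        return True
    if not wildcard:
        return False
    labels = host.split(".")
    # Walk up through every parent domain: a.b.c -> b.c -> c
    return any(".".join(labels[i:]) in blocklist for i in range(1, len(labels)))


blocklist = {"sci-hub.se"}
new_subdomain = "pubs.francais1.orgs.sci-hub.se"

# Exact-match list: the fresh subdomain slips through until it is added.
exact_result = is_blocked(new_subdomain, blocklist, wildcard=False)

# Wildcard list: every subdomain of sci-hub.se is covered already.
wildcard_result = is_blocked(new_subdomain, blocklist, wildcard=True)

# A different registered domain is still unaffected either way.
other_domain_result = is_blocked("sci-hub.st", blocklist, wildcard=True)
```

Under the exact-match rule, each new subdomain costs the site nothing and requires a fresh blocklist entry. Under the wildcard rule, only a newly registered domain escapes, which is where the additional purchases come in.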


Let us spray: River dolphins launch pee streams into air

It’s unclear why river dolphins do this, but it might serve some kind of social function.

According to Amazonian folklore, the area's male river dolphins are shapeshifters (encantade), transforming at night into handsome young men who seduce and impregnate human women. The legend's origins may lie in the fact that dolphins have rather human-like genitalia. A group of Canadian biologists didn't spot any suspicious shapeshifting behavior over the four years they spent monitoring a dolphin population in central Brazil, but they did document 36 cases of another human-like behavior: what appears to be some sort of cetacean pissing contest.

Specifically, the male dolphins rolled over onto their backs, displayed their male members, and launched a stream of urine as high as 3 feet into the air. This usually occurred when other males were around, who seemed fascinated in turn by the arching streams of pee, even chasing after them with their snouts. It's possibly a form of chemical sensory communication and not merely a need to relieve themselves, according to the biologists, who described their findings in a paper published in the journal Behavioural Processes. As co-author Claryana Araújo-Wang of CetAsia Research Group in Ontario, Canada, told New Scientist, “We were really shocked, as it was something we had never seen before.”

Spraying urine is a common behavior in many animal species, used to mark territory, defend against predators, communicate with other members of one's species, or as a means of mate selection since it has been suggested that the chemicals in the urine carry useful information about physical health or social dominance.

Tariffs may soon spike cost of cars, household goods, consumer tech

“A little pain”: Trump finally admits tariffs heap costs on Americans.

Over the weekend, President Trump issued executive orders heaping significant additional tariffs on America's biggest trading partners, Canada, China, and Mexico.

To justify the tariffs—"a 25 percent additional tariff on imports from Canada and Mexico and a 10 percent additional tariff on imports from China"—Trump claimed that all partners were allowing drugs and immigrants to illegally enter the US. Declaring a national emergency under the International Emergency Economic Powers Act, Trump's orders seemed bent on "downplaying" the potential economic impact on Americans, AP News reported.

But very quickly, the trade policy sparked inflation fears, with industry associations representing major US firms from many sectors warning of potentially derailed supply chains and spiked consumer costs of cars, groceries, consumer technology, and more. Perhaps the biggest pain will be felt by car buyers already frustrated by high prices if car prices go up by $3,000, as Bloomberg reported. And as Trump eyes expanding tariffs to the European Union next, January research from the Consumer Technology Association showed that imposing similar tariffs on all countries would increase the cost of laptops by as much as 68 percent, game consoles by up to 58 percent, and smartphones perhaps by 37 percent.

Starlink profit growing rapidly as it faces a moment of promise and peril

“He wants to take food off the table of people—hard-working people.”

Two new independent estimates of revenue from SpaceX's Starlink Internet service suggest it is rapidly growing, having nearly tripled in just two years.

An updated projection from the analysts at Quilty Space estimates that the service produced $7.8 billion in revenue in 2024, with about 60 percent of that coming from consumers who subscribe to the service. Similarly, the media publication Payload estimated that Starlink generated $8.2 billion in revenue last year.

These estimates indicate that Starlink produced a few hundred million dollars in free cash flow for SpaceX in 2024. However, with revenues expected to leap in 2025 to above $12 billion, Quilty Space estimates that free cash flow will grow to about $2 billion. SpaceX is privately held, so its financial numbers are not public.


OpenAI says its models are more persuasive than 82 percent of Reddit users

ChatGPT maker worries about AI becoming “a powerful weapon for controlling nation states.”

At this point, anyone following artificial intelligence is familiar with the many (often flawed) benchmarks companies use to demonstrate a model's effectiveness at everything from math and logical reasoning to vision and weather forecasting. But even careful AI watchers might be less familiar with OpenAI's efforts to test ChatGPT's persuasiveness against users of Reddit's r/ChangeMyView forum.

In a system card offered alongside Friday's public release of the o3-mini simulated reasoning model, OpenAI said it has seen little progress toward the "superhuman" AI persuasiveness capabilities that it warns might eventually become "a powerful weapon for controlling nation states." Still, the company is working to mitigate the risks of even the human-level persuasive writing capabilities shown by its current reasoning models.

Are you smarter than a Redditor?

Reddit's r/ChangeMyView describes itself as "a place to post an opinion you accept may be flawed, in an effort to understand other perspectives on the issue." The forum's 3.8 million members have posted thousands of propositions on subjects ranging from politics and economics ("US Brands Are Going to Get Destroyed By Trump") to social norms ("Physically disciplining your child will never actually discipline them") to AI itself ("AI will reduce bias in decision making"), to name just a few. Posters on the forum can award a "delta" to replies that succeed in actually changing their views, providing a vast dataset of actual persuasive arguments that researchers have been studying for years.
