Dev & DevSecOps, October 19, 2022

Devs: Don’t rely on GitHub Copilot — legal risk gets real

GitHub’s Copilot ML code-completion engine is violating copyright wholesale. So say several high-profile open source advocates.

Richi Jennings, independent industry analyst, editor, and content strategist.
Jet airliner lifting off the runway, viewed from the front, symbolizing developers relying on automation to take flight—before consequences catch up.

It was predictable, really: Microsoft should have seen this coming. It’s ludicrous to blame license violation on the poor, stressed dev trusting GitHub.

The lesson for devs? Be extremely careful about the code fragments you import. In this week’s Secure Software Blogwatch, we go around.

Your humble blogwatcher curated these bloggy bits for your entertainment. Not to mention: Jet powered coffin.

Shut up and think of the deadline

What’s the craic? Tim Anderson reports — “GitHub Copilot under fire”:

“Wrongful use of copyright code”

Developer Tim Davis, a professor of Computer Science and Engineering at Texas A&M University, has claimed … that GitHub Copilot, an AI-based programming assistant, “emits large chunks of my copyrighted code, with no attribution, no LGPL license.” … The code Davis posted does seem very close.

…

One of the concerns in the open source community is that if chunks of open source code are regurgitated wholesale, without specifying any license, then it is breaking the purpose of the license. Another concern is that developers may inadvertently combine code with incompatible licenses into one project.

…

Part of the problem is that open source code, by design, is likely to appear in multiple projects by different people, so it will end up multiple times on GitHub and among multiple users of Copilot. With or without Copilot, developers can make wrongful use of copyright code.

Horse’s mouth? Tim Davis — @DocSparse — has more:

“I’m passionately committed to open source”

For example, the simple prompt "sparse matrix transpose, cs_" produces my cs_transpose in CSparse. … Same variable names, helper functions, comments. … Not OK. [And] there's no way to opt out of GitHub's use of my code by Copilot.

…

Somehow it knows how to complete the comment /* sparse matrix transpose in the style of Tim Davis*/ and then return … my LGPL code verbatim, with no license stated and no copyright. … So why not also keep the copyright and license intact? … I plan on asking GitHub to emit my copyright and license when it emits my code. … They're smart people — they can figure it out. … Also, academia rewards citations and use of work. If my name is stripped then I lose that way too.

…

My sparse C=A*B is faster than the one [previously] in MATLAB … and the Intel MKL sparse library. Why would I bother to take the time (years) to write such code if I can't benefit from copyright protection? … It is all humanity-advancing open source code. … I'm passionately committed to open source code. Redis uses my code. … The Julia language, scipy, R, every linux distro. The code can be found in many drones, robots … inside every Occulus / Meta headset [and] Google StreetView.

It’s not only Davis. Kip Kniskern knows — “Copilot apparently violating open source licensing”:

“Microsoft has been vague”

Writer, lawyer, and programmer Matthew Butterick has some issues with Microsoft's machine-learning based code assistant, GitHub Copilot, and the way it is apparently mishandling open-source licenses. … It's the way the AI is trained, or more precisely from where it's trained, that is becoming a problem for developers like Butterick.

…

The problem here is that these public repos that GitHub is trained on are licensed, and require attribution. … Microsoft has been vague about its use of the code, calling it fair use. But … for programmers like Butterick, who contribute open source code out of a sense of community, stripping any attribution away from their work is a problem.

Giddyup. Matthew Butterick asks, “How will you feel if Copilot erases your open-source community?”:

“It is a parasite”

I’ve been professionally involved with open-source software since 1998, including two years at Red Hat. … In June 2022, I wrote about the legal problems with GitHub Copilot, in particular its mishandling of open-source licenses.

…

I’m currently working with the Joseph Saveri Law Firm to investigate a potential lawsuit against GitHub Copilot … for violating its legal duties to open-source authors and end users. … Once you accept a Copilot suggestion, all that becomes your problem. [But] how can Copilot users comply with the license if they don’t even know it exists? … To be fair, Microsoft doesn’t really dispute this. They just bury it in the fine print.

…

Obviously, open-source developers … don’t do it for the money. … But we don’t do it for nothing, either. A big benefit of releasing open-source software is the people: the community of users, testers, and contributors that coalesces around our work. Our communities help us make our software better in ways we couldn’t on our own.

…

Copilot is … poisonous to open source. … It is a parasite.

If I use Copilot, what’s the risk? entfe001 explains:

Closed source licenses will use copyright law to make sure you can't share, modify or reuse their code. Open source licenses will use copyright law to make sure you can share, modify or reuse their code—on their conditions.

Where this **** AI falls foul is that they might share, modify and reuse third party code without granting whatever rights or obligations the original license "gave" to the training set. For starters, most … open source licenses require that a copy of the license itself to be given along with the source code, no matter if the whole work or just a part.

…

For MIT-like licenses, not retaining authorship notices is a copyright license violation. For GPL-like it is even worse, as none of the GPL granted rights would be passed upon downstream, which is by itself a violation.
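
To make the notice-retention point concrete, here is a minimal sketch of how a dev might check whether a pasted fragment still carries a license tag and a copyright line. It assumes the upstream project marks files with the SPDX-License-Identifier convention, which many projects do not, so a missing tag proves nothing about whether a license applies. The function name notice_report, the 20-line header window, and the sample author are hypothetical, chosen for illustration only.

```python
import re

# Matches the SPDX short-form tag, e.g. "SPDX-License-Identifier: MIT".
SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+\-]+)")
# Loose heuristic for a copyright line: "Copyright (c)", "Copyright ©", "Copyright 2006".
COPYRIGHT_RE = re.compile(r"copyright\s+(\(c\)|©|\d{4})", re.IGNORECASE)


def notice_report(source: str, header_lines: int = 20) -> dict:
    """Report which notices appear in the first `header_lines` lines of `source`.

    This is a heuristic only: it looks for an SPDX tag and a copyright line.
    It does not decide whether a license applies or has been complied with.
    """
    header = "\n".join(source.splitlines()[:header_lines])
    spdx = SPDX_RE.search(header)
    return {
        "license_id": spdx.group(1) if spdx else None,
        "has_copyright": bool(COPYRIGHT_RE.search(header)),
    }


# Hypothetical upstream file header (not taken from any real project).
original = (
    "/* SPDX-License-Identifier: LGPL-2.1-or-later\n"
    " * Copyright (c) 2006 Example Author\n"
    " */\n"
    "int answer(void) { return 42; }\n"
)
# The same code after the notices have been stripped away.
stripped = "int answer(void) { return 42; }\n"

print(notice_report(original))
print(notice_report(stripped))
```

A fragment that comes back with no license tag and no copyright line is not necessarily safe to use. It may simply be code whose notices were lost on the way to your editor.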

It gets worse: Copilot is spitting out closed source code, too. So says esskay:

I had something similar happen … a couple of days ago. I'm on friendly terms with a competing codebase's developer and have confirmed the following with them (both mine and it are closed source and hosted on GitHub).

Halfway through building something I was given a block of code by Copilot, which contained a copyright line with my competitors name, company number and email address. Those details have never, ever been published in a public repository. How did that happen?

It’s a legal compliance nightmare. Here’s u/Untgradd:

I do open source compliance activities for a software product, so I’m intimately aware of … the kind of licensing requirements typically found in open source software. This service seems to be actively causing a compliance nightmare.

…

Given my understanding, say some … dev autocompletes their way through a particularly productive sprint and you inadvertently release code containing copyleft code. [Now] you’re legally obligated to release your source code. … I could imagine a scenario where some clever folks effectively grep for well known copyleft snippets as a means of targeting closed source … software.
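
The "grep for well known copyleft snippets" idea can be sketched as a small fingerprint comparison. This is an illustration under stated assumptions, not how any real compliance scanner works: it strips whitespace, hashes every run of three consecutive non-empty lines, and reports what share of a known snippet's fingerprints also appear in a candidate file. The names normalize, fingerprints, and overlap, the window size, and the sample snippet are all hypothetical.

```python
import hashlib
import re

WINDOW = 3  # consecutive normalized lines per fingerprint (an arbitrary choice)


def normalize(source: str) -> list[str]:
    """Drop all whitespace from each line and discard lines that end up empty."""
    lines = []
    for line in source.splitlines():
        collapsed = re.sub(r"\s+", "", line)
        if collapsed:
            lines.append(collapsed)
    return lines


def fingerprints(source: str, window: int = WINDOW) -> set[str]:
    """Hash every run of `window` consecutive normalized lines."""
    lines = normalize(source)
    return {
        hashlib.sha256("\n".join(lines[i:i + window]).encode("utf-8")).hexdigest()
        for i in range(len(lines) - window + 1)
    }


def overlap(candidate: str, known: str, window: int = WINDOW) -> float:
    """Fraction of the known snippet's fingerprints that also occur in the candidate."""
    known_fp = fingerprints(known, window)
    if not known_fp:
        return 0.0
    return len(fingerprints(candidate, window) & known_fp) / len(known_fp)


# Hypothetical "well known" snippet and a reindented copy embedded in other code.
known = "for (i = 0; i < n; i++) {\n    total += values[i];\n}\nreturn total;\n"
candidate = (
    "int sum(void) {\n"
    "  for (i=0; i<n; i++) {\n"
    "\ttotal += values[i];\n"
    "  }\n"
    "  return total;\n"
    "}\n"
)

print(overlap(candidate, known))
print(overlap("print('hi')\n", known))
```

Whitespace-only normalization catches reformatted copies but not renamed variables or reordered statements, so a low score says little. A high score is a prompt to check the license, not a legal finding.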

But surely Microsoft has a point? It’s all out there in public, so it’s fair use — right? b0llchit thinks a thought experiment:

By that standard you can take all what is written about books and use the description's content text to train a ML system. Then when you use the system and it writes, "Henry Flotter and the magical wanderer's gem," we'll see how long the fair use defence will stand.

Meanwhile, what can be done about it? Jed Brown — @five9a2 — has this suggestion for Microsoft:

Replace with a Clippy: “It looks like you’re trying to implement a sparse matrix library. Have you considered calling a high quality library such as … ?”

And Finally:

“Crazy” Bob cuts out the middleman

You have been reading Secure Software Blogwatch by Richi Jennings. Richi curates the best bloggy bits, finest forums, and weirdest websites … so you don’t have to. Hate mail may be directed to @RiCHi or ssbw@richi.uk. Ask your doctor before reading. Your mileage may vary. Past performance is no guarantee of future results. Do not stare into laser with remaining eye. E&OE. 30.

Image sauce: Midland International Airport (cc:by-nd; leveled and cropped)
