Aug 3, 2024

The ethical minefield of GenAI


What you need to know and how you can use it responsibly

I’m sure James Madison, the father of copyright law, would have had something to say about Generative AI.

“Open AI is the antichrist.”

That’s how a conversation started with a friend of mine in January 2023. He saw the future, and he did not like it. I was living in the future, and saw the benefits.

Like any conversation I try to have with my friends, we met in the middle, however uncomfortable it was, and continue to do so to this day.

Here were the positions:

- As a former journalist, he saw the destruction of content creation, full stop. For some content, he is not wrong. There are particular segments of our workforce that saw that effect early on, and continue to see it at an accelerated rate.
- As a former journalist myself, to a lesser degree, and someone working at a document technology startup, I also saw the advantages, like transferring mind-numbing work to a system so people could be more strategic.

I walked him back from his initial concerns — copyright, bias, privacy and a few other categories listed here — and we agreed that legal technology, as a domain, is one of the few use cases where this technology avoids a lot of the challenges.

- Users rely on it for everything from searching across documents to extracting data from their own content.
- That content sits in a private, secure bubble; because it's company data, it has to be.
- Our customers use it responsibly because of who they are: professionals with an eye on the cutting edge who still respect thousands of years of precedent, and who are establishing policies to use the technology responsibly.

That last point alone gives me confidence that everyone will come to their own conclusions about how to use it, but all three validate why so much money is in the document space versus other domains: it's one of the few places AI makes sense, and it's a space that has been using machine learning models for years, just not at this level of innovation. They've also been doing it responsibly.
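To make the first of those points concrete, here is a minimal sketch of what searching across your own documents in a private bubble can look like. It's an illustration under assumptions (scikit-learn, a few made-up document snippets), not any vendor's actual implementation:

```python
# A minimal sketch of private, local document search. Illustration
# only, not any vendor's implementation. Assumes scikit-learn and a
# few made-up document snippets; nothing here leaves the machine.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Master services agreement between Acme Corp and Vendor LLC.",
    "Non-disclosure agreement covering proprietary deal terms.",
    "Employment offer letter with a compensation schedule.",
]

vectorizer = TfidfVectorizer(stop_words="english")
doc_vectors = vectorizer.fit_transform(documents)

def search(query: str, top_k: int = 2):
    """Rank local documents against a query; no external API calls."""
    query_vector = vectorizer.transform([query])
    scores = cosine_similarity(query_vector, doc_vectors).ravel()
    best = scores.argsort()[::-1][:top_k]
    return [(documents[i], round(float(scores[i]), 3)) for i in best]

print(search("confidentiality and proprietary terms"))
```

A real product would layer embeddings, LLM extraction, and access controls on top, but the principle is the same: the content never leaves that private, secure bubble.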

However, not everyone is there, so you personally have to act accordingly.

Generative AI is a tool, a transformation that’s going to change our lives, some of it for the better. Like any tool, you have to use it responsibly. Anyone can use a hammer in an irresponsible manner, and the same applies here.

I’m going to approach it like a weather report: staying as neutral as possible, but highlighting the concerns that even I have about the technology in the public square. The world isn’t a fair place, but you can make it more fair by how you personally act, contributing to a better global village.

It’s up to you.

Copyright

Let’s be clear: AI companies like OpenAI, Anthropic and Google used a lot of internet content to train their models, so much that they’re running out of it.

Much of it is copyrighted, and they didn’t exactly ask permission — I didn’t personally get an email about my blog, unless it went to my spam folder. I’ve even gone so far as to say that Wikipedia is probably the foundation of it all, and that without Wikipedia and its Creative Commons licensing, these models would not exist.

This alone is a really sticky legal problem.

Most of the companies are claiming fair use: Section 107 of the Copyright Act defines it as “criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, [and] research.” It’s vague enough that it fits the “I’ll know it when I see it” model.

This interpretation of copyright law is something I’m fairly sure James Madison didn’t anticipate in 1787 while he was trying to figure out where to put his AOL CD into the horse. It’s a really generous reading, and it will have to be resolved legally. That will take decades.

We will need new laws to deal with AI and copyright. This discussion will go on forever among technology experts and lawyers as the technology evolves, just as it has with previous copyright issues.

And it’s not new.

As an example, we haven’t really solved copyright issues for social media networks — Instagram claims that you grant it a license so it can use your content — and those networks have been around for decades. Regulations like GDPR (2018) and CCPA (2020) went into effect fairly recently, and social media companies still find ways to live at the edge.

Another example? Font licensing, to the chagrin of Adobe and other font owners. Most designers don’t know that typeface designs are not copyrightable in the U.S. (the font software is), and the font foundries have been trying to change that for years.

The AI copyright issue will take decades to resolve, and I guarantee almost no one will be happy with the solution, but we’ll live with it.

Resolution

There is no easy solution, especially for content already consumed by the LLMs. Technology companies are going to do what they have always done, which is keep a flexible definition of what they can do and let everyone else adjust accordingly (read: Uber), and that’s the case here.

The law will adjust accordingly; it’ll just take time.

How can you use it responsibly? Always treat the output as a derivative work: I do a pretty heavy edit, and I don’t use images that look like anything that’s been copyrighted. I’ll also use it for an edit pass, but never for the core content.

Data Privacy

To repeat the copyright point: AI companies like OpenAI and Anthropic use a lot of information. This sometimes includes details that shouldn’t be in the public square, whether it’s company data or personal information.

They’re trying to keep private information safe, but just as it’s hard for any company to secure its systems — all it takes is one hole — the same goes for this technology.

There’s always some way to unlock technology, and someone always will.

This problem isn’t unique to AI. Worrying about how OpenAI is handling your data ignores the fact that many companies have much more information about you, secured just about as well. Additionally, there’s a lot more information out there that is public or hidden in plain sight — there’s an amazing amount of information about companies on the SEC EDGAR database, for example — so for public companies, the notion of privacy doesn’t even apply by law.

Data privacy will always be an issue in modern society; how we approach and perceive it is what matters.

Resolution

If you want to keep something private, don’t put it on the internet, ever. It’s up to you to decide what should be there or not. Many companies are establishing policies around this — one acquaintance said their company has locked down all of its corporate systems — and that’s a good thing.

Protecting sensitive information is a must during this time.

Authenticity

When was the last time I saw an LLM hallucinate?

Today.

I was walking someone through one of the applications and entered their name. It came back with information for someone completely different at the company, along with other incorrect details.

This doesn’t happen a lot, but it does happen more with content the LLMs don’t have much context for, or where there are too many matches they can’t line up. For example, an acquaintance of mine has a rather common name, and it mixes him up with an actor in England. I don’t have that problem, except for that pesky Senior Vice President of Finance I’m friends with on Facebook, so the information returned for me is most probably about me.

We laughed about it and moved on.

When you add the fact that not all information on the internet is factual — yes, there isn’t an Easter Bunny or Santa Claus either, except on some marketing site selling costumes for both — authenticity becomes closely related to bias.

They’re trying to fix these problems, but it’s hard, because LLMs are designed to predict what words come next, not to know what’s true. Neither does Google. Nor does any other search engine that’s ever been invented, at least not without human intervention.
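To see what “predicting what words come next” means mechanically, here’s a minimal sketch, assuming the Hugging Face transformers library and the small, publicly available gpt2 checkpoint:

```python
# A minimal sketch of next-token prediction, the mechanism described
# above. Assumes the Hugging Face transformers library and the small
# publicly available gpt2 checkpoint; illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The capital of Australia is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    # Scores for the single token that would come after the prompt.
    next_token_logits = model(**inputs).logits[0, -1]

# The model ranks plausible continuations; plausible is not true.
top = torch.topk(next_token_logits, k=5)
for token_id, score in zip(top.indices.tolist(), top.values.tolist()):
    print(repr(tokenizer.decode(token_id)), round(score, 2))
```

Nothing in that ranking checks a fact; a small model can easily score a familiar-but-wrong continuation above the correct one, because frequency in the training data, not truth, drives the prediction.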

We’ve been living with this for the last 30 years, and will continue to do so. It doesn’t seem to affect search engine engagement either.

It’s really up to us to decide what’s right and what’s wrong. Search engines return wrong results sometimes, and so do the GPTs. Both are learning to get better, and it’ll take time.

Resolution

I tell everyone who uses any technology to double-check what they’re seeing, the way a journalist would: one source is an opinion, but a second reliable source is validation. LLMs return information with the same confidence as Google search results, and no one seems to be bothered by that.

The resolution: Trust and verify.

Bias

Repeat after me 100 times: “People are biased, so data is biased.”

The issue of bias in AI models is part of a larger, ongoing conversation about ethics and fairness in artificial intelligence, but we forget these systems are built by humans.

The reality of the world is that it isn’t a fair place, and this is reflected in the data we generate as a society.

For example, the greatest predictor of success in your life is the zip code you grew up in, full stop. For the record, mine was 92804 — I went to Walt Disney Elementary School in Anaheim, California. The only benefit was a free trip in the sixth grade.

This is not a new problem — who can forget facial recognition software performing less accurately for certain racial groups, or job application screening algorithms favoring particular demographics? — and language models are just the latest technology to face this challenge.

Efforts to address bias in AI are part of a larger push for “responsible AI” or “ethical AI.” This movement includes not just addressing bias, but also concerns about AI transparency, accountability, privacy protection, and potential misuse.

It’s going to take a long time.

Resolution

Until we have transparency, it’s up to you. As with anything you see, there’s going to be a lens of perception you’ll have to view the information through. You’ll determine how much bias there is, and you’ll have to convert it to your own mental model.

This conversion is something we all do, every day of the week.

We also have to call for responsible AI. It’s important that there is some type of global movement to get there. It’ll be people and laws. A mix.

Conclusion

Like technology disruptions before it — Napster comes to mind — all of this will work its way through the system toward less-than-ideal solutions that we all accept, flaws and all. Musicians don’t like Spotify’s revenue model, but they still use it and other platforms.

Other examples:

- Search engines have their flaws and don’t know the truth, but they still generate billions in advertising revenue.
- Merchants might not necessarily like Amazon, but they accept the platform because of its massive reach.
- We will all accept the risks of Generative AI once we see the benefits.

It’s all about responsible usage. We learned from the other applications, and we’ll learn from this one.

We have to approach it with a realistic view of not only how it’s implemented, but how it’s analogous to technology issues of the past. That will help us solve for the future, plain and simple.

To quote the movie Contact:

“You’re an interesting species. An interesting mix. You’re capable of such beautiful dreams, and such horrible nightmares. You feel so lost, so cut off, so alone, only you’re not. See, in all our searching, the only thing we’ve found that makes the emptiness bearable, is each other.”

Be responsible. Trust and verify. Campaign for change so there are adequate guardrails in place.

My glass-half-full thought is that we’ll get there, one person at a time.

This is the end of my TED talk.

Other articles worth reading about this

- Designing in the age of ChatGPT
- Striking the Balance: Navigating the Ethics of Generative AI and the Need for Regulation
- Ethical Pitfalls of Generative AI

Patrick Neeman is the Vice President of User Experience and Research at Evisort and an advisor for Relevvo. He is also the author of Usability Counts and runs the UX Drinking Game. You can read more about him at Perplexity, and connect with him on LinkedIn, X (formerly Twitter), Threads and Substack.


