Machine Learning | Artificial Intelligence


Welcome to Machine Learning – a versatile digital hub where Artificial Intelligence enthusiasts unite. From news flashes and coding tutorials to ML-themed humor, our community covers the gamut of machine learning topics. Regardless of whether you're an AI expert, a budding programmer, or simply curious about the field, this is your space to share, learn, and connect over all things machine learning. Let's weave algorithms and spark innovation together.

27

This is about Benjamin Grimmer's paper https://arxiv.org/abs/2307.06324, in which he proves that, under certain conditions, gradient descent with periodically long steps converges faster than with constant short steps.
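For intuition, here's a toy sketch (my own, not the paper's actual step-size pattern): gradient descent on an ill-conditioned quadratic, where a repeating schedule that mixes in an occasional long step beats a constant step, even though the long step momentarily overshoots along the steep direction. The specific step values are made up for demonstration.

```python
import numpy as np

def run_gd(schedule, iters=30):
    A = np.diag([1.0, 10.0])        # f(x) = 0.5 * x^T A x, smoothness L = 10
    x = np.array([1.0, 1.0])
    for k in range(iters):
        x = x - schedule[k % len(schedule)] * (A @ x)   # gradient of f is A x
    return 0.5 * x @ A @ x          # final objective value

print(run_gd([0.09]))               # constant step just under 1/L
print(run_gd([0.09, 0.09, 0.27]))   # same, but every third step is 3x longer
```

The long step temporarily grows the error in the steep coordinate, but the short steps damp it back down, while the slow, flat coordinate benefits from the extra progress; that's the flavor of the paper's result.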

28

I've got a bot running (and in development) to detect and flag toxic content on Lemmy, but I'd like to improve it, as I'm getting quite a few false positives. I think part of the reason is that what constitutes toxic content often depends on the parent comment or post.

During a recent postgrad assignment I was taught (and saw for myself) that a bag-of-words model usually outperforms LSTM or transformer models for toxic text classification, so I've run with that, but I'm wondering if it was the right choice.

Does anyone have any ideas on what kind of model would be best suited to include the parent as context, without explicitly considering whether the parent itself is toxic? I'm guessing some sort of transformer model, but I'm not quite sure how it would look or work.
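One plausible setup (a sketch under my own assumptions, not a tested recipe): a cross-encoder that takes parent and reply as a sentence pair, so the parent is seen only as context in segment A while the label applies to the reply in segment B. The model name and labels below are placeholders to fine-tune on your own data; the classifier head here is untrained.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # 0 = fine, 1 = toxic; fine-tune first
)

parent = "I think the article makes a fair point."
reply = "Only an idiot would believe that."

# Parent goes in segment A (context only), the reply being judged in segment B.
inputs = tokenizer(parent, reply, truncation=True, return_tensors="pt")
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)
print(probs)  # the toxicity label applies to the reply alone
```

Because only the reply's toxicity is ever labeled during fine-tuning, the model learns to use the parent as context without being asked to judge it.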

29

Hi everyone, sorry if this is not the right community; just let me know if so.

I'm wondering if anyone has any recommendations for which job agencies to register with for ML jobs. I have experience with Python, mainly using PyTorch, and a bit of TensorFlow from a few years ago.

30

cross-posted from: https://lemmy.ml/post/2811405

"We view this moment of hype around generative AI as dangerous. There is a pack mentality in rushing to invest in these tools, while overlooking the fact that they threaten workers and impact consumers by creating lesser quality products and allowing more erroneous outputs. For example, earlier this year America’s National Eating Disorders Association fired helpline workers and attempted to replace them with a chatbot. The bot was then shut down after its responses actively encouraged disordered eating behaviors. "

32

I am an ML engineer/researcher but have never looked into music before. Some quick googling turns up plenty of websites doing automatic music generation, but I'm not sure what methods/architectures they use. I'm sure I could find papers with more searching, but I'm hoping someone can give me a summary of the current SOTA, and maybe some links to code/models to get started with.
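Not a full survey, but one concrete starting point: Meta's MusicGen, a transformer language model over EnCodec audio tokens, ships with open code and weights in the audiocraft library. A minimal sketch, assuming the audiocraft API roughly as documented (checkpoint names have shifted between releases, so treat them as approximate):

```python
# pip install audiocraft
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained("facebook/musicgen-small")  # also: medium, large
model.set_generation_params(duration=8)   # seconds of audio to generate

descriptions = ["lo-fi hip hop beat with warm piano"]
wav = model.generate(descriptions)        # batch of waveforms, shape [B, C, T]

for i, one_wav in enumerate(wav):
    # Writes track_0.wav etc., loudness-normalized
    audio_write(f"track_{i}", one_wav.cpu(), model.sample_rate, strategy="loudness")
```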

33

Where can I go to learn about and discuss Facebook's Llama 2 source code? There aren't many comments in the code.

34

The MLOps community is flooded with tooling, especially pipeline orchestration tools. What does your stack look like?

38

An update on Google's efforts to apply LLMs in the medical field.

43

Great series on machine learning. Posting for anyone interested in more of the details on AIs and LLMs and how they're built/trained.

45

In the hidden layers, the activation function decides what the neural network computes. Is it possible for an AI to generate activation functions for itself, so it can improve upon itself?
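There is real work in this direction: Swish itself was found by an automated search over candidate activation functions (Ramachandran et al., "Searching for Activation Functions"). A more modest, concrete version is to make the activation's shape trainable, so the network tunes it by gradient descent along with its weights. A minimal PyTorch sketch of that idea:

```python
import torch
import torch.nn as nn

class LearnableSwish(nn.Module):
    """Swish-like activation x * sigmoid(beta * x) with a trainable beta."""
    def __init__(self):
        super().__init__()
        self.beta = nn.Parameter(torch.ones(1))  # learned during training

    def forward(self, x):
        return x * torch.sigmoid(self.beta * x)

net = nn.Sequential(nn.Linear(8, 16), LearnableSwish(), nn.Linear(16, 1))
out = net(torch.randn(4, 8))  # beta receives gradients like any other weight
```

Searching over whole functional forms, as in the Swish paper, goes further and needs an outer search loop around training, but the trainable-parameter version already lets the network "improve its own activation" in a limited sense.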

46

Hi lemmings, what do you think about this, and do you see a parallel with the human mind?

... "A second, more worrisome study comes from researchers at the University of Oxford, University of Cambridge, University of Toronto, and Imperial College London. It found that training AI systems on data generated by other AI systems — synthetic data, to use the industry’s term — causes models to degrade and ultimately collapse" ...

47

cross-posted from: https://lemmy.world/post/811496

Huge news for AMD fans and those who are hoping to see a real* open alternative to CUDA that isn't OpenCL!

*: Intel doesn't count, they still have to get their shit together in rendering things correctly with their GPUs.

We plan to expand ROCm support from the currently supported AMD RDNA 2 workstation GPUs: the Radeon Pro v620 and w6800 to select AMD RDNA 3 workstation and consumer GPUs. Formal support for RDNA 3-based GPUs on Linux is planned to begin rolling out this fall, starting with the 48GB Radeon PRO W7900 and the 24GB Radeon RX 7900 XTX, with additional cards and expanded capabilities to be released over time.
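Practically, on a ROCm build of PyTorch the familiar CUDA-flavored API is backed by HIP, so existing code needs no changes. A quick sanity check might look like this (assuming a supported card and a ROCm wheel installed):

```python
import torch

print(torch.version.hip)            # ROCm/HIP version string (None on CUDA builds)
print(torch.cuda.is_available())    # True if a supported AMD GPU is visible
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. an RX 7900 XTX once supported
```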

49

And another commercially viable open-source LLM!

50

TLDR Summary:

  • MIT researchers developed a 350-million-parameter self-training entailment model to enhance smaller language models' capabilities, outperforming larger models with 137 to 175 billion parameters without human-generated labels.

  • The researchers enhanced the model's performance using 'self-training,' where it learns from its own predictions, reducing human supervision and outperforming models like Google's LaMDA, FLAN, and GPT models.

  • They developed an algorithm called 'SimPLE' to review and correct noisy or incorrect labels generated during self-training, improving the quality of self-generated labels and model robustness.

  • This approach addresses inefficiency and privacy issues of larger AI models while retaining high performance. They used 'textual entailment' to train these models, improving their adaptability to different tasks without additional training.

  • By reformulating natural language understanding tasks like sentiment analysis and news classification as entailment tasks, the model's applications were expanded (a sketch of this entailment trick appears after this list).

  • While the model showed limitations in multi-class classification tasks, the research still presents an efficient method for training large language models, potentially reshaping AI and machine learning.
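For a feel of the entailment reformulation (this is not the MIT model itself): Hugging Face's zero-shot classification pipeline uses the same trick, scoring whether the input entails a hypothesis built from each candidate label with an off-the-shelf NLI model.

```python
from transformers import pipeline

# Each label is slotted into the hypothesis template; the NLI model scores
# how strongly the input text entails each resulting hypothesis.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
result = classifier(
    "The new phone's battery life is outstanding.",
    candidate_labels=["positive", "negative", "neutral"],
    hypothesis_template="The sentiment of this review is {}.",
)
print(result["labels"][0], result["scores"][0])  # top label and its score
```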
