
In human conversations, individuals can indicate relevant regions within a scene while addressing others, and the other person can respond by referring to specific regions in turn. This natural referential ability in dialogue remains absent in current Multimodal Large Language Models (MLLMs). To fill this gap, this paper proposes an MLLM called Shikra, which can handle spatial coordinate inputs and outputs in natural language. Its architecture consists of a vision encoder, an alignment layer, and an LLM. It is designed to be straightforward and simple, without the need for extra vocabularies, position encoders, pre-/post-detection modules, or external plug-in models. All inputs and outputs are in natural language form. Referential dialogue is a superset of various vision-language (VL) tasks, so Shikra can naturally handle location-related tasks like REC and PointQA as well as conventional VL tasks such as Image Captioning and VQA. Experimental results showcase Shikra's promising performance. Furthermore, it enables numerous exciting applications, such as providing the coordinates of mentioned objects in chains of thought and comparing the similarity of user-pointed regions. Our code, model, and dataset are available at this https URL.
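
To make the coordinate-in-text idea concrete, below is a hypothetical referential-dialogue turn; the bracketed [x1, y1, x2, y2] boxes normalized to [0, 1] are assumptions for illustration, and the paper's exact serialization may differ.

```python
# Hypothetical referential-dialogue turn: coordinates travel as plain text.
# The [x1, y1, x2, y2] boxes normalized to [0, 1] are illustrative assumptions.
user_turn = ("What is the animal in the region [0.32, 0.11, 0.58, 0.47], "
             "and where is its food bowl?")
model_turn = ("The region [0.32, 0.11, 0.58, 0.47] shows a tabby cat; "
              "its food bowl is at [0.61, 0.70, 0.78, 0.84].")
print(user_turn)
print(model_turn)
```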


Amazon CEO Andy Jassy called generative A.I. “one of the biggest technical transformations of our lifetimes” in an interview with CNBC on Thursday. He also called many of today’s A.I. chatbots and other generative A.I. tools part of the “hype cycle,” declaring that Amazon was focused on the “substance cycle.”

Amazon's bona fides in the space are well established: the company was a player in artificial intelligence and machine learning long before the ChatGPTs and Bards of the world were publicly released. Former Fortune editor Brian Dumaine wrote a book in 2020 about how Amazon founder Jeff Bezos realized early on that imbuing machine learning into every facet of the company would allow it to gather data to constantly improve itself.

Much as it did with Amazon Web Services, which practically birthed the cloud computing industry that now powers the internet’s biggest companies, including its competitors, Amazon’s A.I. strategy is focused on cementing its position as a major player across the entirety of the A.I. supply chain.

“Every single business unit inside of Amazon is working intensely and very broadly on generative A.I.,” Jassy says.

Jassy shed some light on Amazon’s A.I. game plan, outlining three macro layers: the computing capabilities, the underlying models, and what Jassy refers to as the “application layer,” for example, ChatGPT or Bard.


For 3D object manipulation, methods that build an explicit 3D representation perform better than those relying only on camera images. But using explicit 3D representations like voxels comes at a large computational cost, adversely affecting scalability. In this work, we propose RVT, a multi-view transformer for 3D manipulation that is both scalable and accurate. Key features of RVT are an attention mechanism for aggregating information across views and re-rendering of the camera input from virtual views around the robot workspace. In simulations, we find that a single RVT model works well across 18 RLBench tasks with 249 task variations, achieving 26% higher relative success than the existing state-of-the-art method (PerAct). It also trains 36X faster than PerAct to reach the same performance and achieves 2.3X PerAct's inference speed. Further, RVT can perform a variety of manipulation tasks in the real world with just a few (∼10) demonstrations per task. Visual results, code, and trained model are provided at this https URL.
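
As a rough illustration of the cross-view aggregation idea (a sketch, not the authors' implementation), the snippet below fuses feature tokens rendered from several virtual views with a single self-attention layer; the dimensions, token counts, and module name are assumptions.

```python
import torch
import torch.nn as nn

class CrossViewAttention(nn.Module):
    """Sketch: fuse per-view feature tokens with self-attention across all views."""
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, view_tokens: torch.Tensor) -> torch.Tensor:
        # view_tokens: (batch, num_views * tokens_per_view, dim), all views concatenated
        fused, _ = self.attn(view_tokens, view_tokens, view_tokens)
        return self.norm(view_tokens + fused)

# Toy usage: 5 virtual views, 64 tokens each, 256-dim features.
tokens = torch.randn(2, 5 * 64, 256)
print(CrossViewAttention()(tokens).shape)  # torch.Size([2, 320, 256])
```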


Covid-19 is said to cause long-term side effects in up to 67% of patients, and these health consequences can include chronic fatigue, loss of taste and smell, and brain fog. Increasingly common too is Covid-related hair loss. Known as telogen effluvium, this phenomenon manifests as clumps of hair falling out after brushing or washing your hair.

It’s normal to shed hair daily – we lose about 100-150 hairs each day as hair drops from follicles to make way for new hair growth. This growth cycle occurs because 90% of the hair on our heads is in a growth phase (called anagen), while the remaining 10% is in a resting phase (called telogen). Anagen lasts for about three years before transitioning into the shorter telogen phase, following which hair is shed.

Stressors such as childbirth, certain medications, intense psychological stress and Covid-19 can trigger our bodies to shift a greater-than-normal proportion of growing anagen hairs into the resting telogen state, according to the University of Utah.

“Covid-related hair loss can affect up to 33% of symptomatic patients and 10% of asymptomatic patients,” says a plastic surgeon who deals with hair loss patients. “And this kind of hair loss seems to be different from that induced by stress or disease as cytokines (substances secreted by the body’s immune system) appear to cause direct damage to hair follicles,” she adds.

Covid-induced hair loss has also been reported to start earlier after the stressful event – in two months instead of the usual three.


Recent work suggests that interpolating between the weights of two specialized language models can transfer knowledge between tasks in a way that multi-task learning cannot. However, few studies have explored interpolation between more than two models, each with a distinct knowledge base. In this paper, we introduce Derivative Free Weight-space Ensembling (DFWE), a new few-sample task transfer approach for open-domain dialogue. Our framework creates a set of diverse expert language models trained on a predefined set of source tasks. Next, we finetune each of the expert models on the target task, approaching the target task from several distinct knowledge bases. Finally, we linearly interpolate between the model weights using a gradient-free optimization algorithm to efficiently find a good interpolation weighting. We demonstrate the effectiveness of the method on FETA-Friends, outperforming the standard pretrain-finetune approach.
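
A minimal sketch of the interpolation step, assuming PyTorch-style state dicts with floating-point parameters; the random Dirichlet search below stands in for whichever gradient-free optimizer the paper actually uses, and all function names are placeholders.

```python
import copy
import numpy as np

def interpolate_state_dicts(state_dicts, alphas):
    """Weighted average of expert weights; alphas are non-negative and sum to 1."""
    merged = copy.deepcopy(state_dicts[0])
    for key in merged:
        merged[key] = sum(float(a) * sd[key] for a, sd in zip(alphas, state_dicts))
    return merged

def search_mixture(model, state_dicts, eval_fn, trials=50, seed=0):
    """Derivative-free search: sample mixing weights on the simplex, keep the best."""
    rng = np.random.default_rng(seed)
    best_alphas, best_score = None, float("-inf")
    for _ in range(trials):
        alphas = rng.dirichlet(np.ones(len(state_dicts)))
        model.load_state_dict(interpolate_state_dicts(state_dicts, alphas))
        score = eval_fn(model)  # e.g. accuracy on a small target-task dev set
        if score > best_score:
            best_alphas, best_score = alphas, score
    return best_alphas, best_score
```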


The idea is simple: specify what you want to research, and the AI will autonomously research it for you in minutes! (A rough sketch of the underlying loop follows the feature list below.)

▸ One prompt generates an unbiased, factual, and in-depth research report

▸ Generates research, outline, resource, and lesson reports

▸ Aggregates over 20 web sources per research task

▸ Includes an easy-to-use web interface

▸ Open source: https://github.com/assafelovic/gpt-researcher

▸ Scrapes web sources with JavaScript support

▸ Keeps track of the context of visited and used web sources
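
For intuition, here is a generic, hypothetical sketch of the plan-search-summarize-write loop such an agent runs; the function names and parameters are placeholders, not the gpt-researcher API.

```python
from typing import Callable, List

def plan_queries(topic: str, llm: Callable[[str], str]) -> List[str]:
    """Ask the LLM for a handful of web search queries covering the topic."""
    return llm(f"List five web search queries for researching: {topic}").splitlines()

def research(topic: str, llm: Callable[[str], str],
             search: Callable[[str, int], List[str]],
             fetch: Callable[[str], str]) -> str:
    """Plan queries, scrape and summarize each source, then draft a sourced report."""
    notes = []
    for query in plan_queries(topic, llm):
        for url in search(query, 4):       # a few results per query, ~20 sources overall
            page = fetch(url)              # scraper (with JavaScript rendering if needed)
            notes.append(f"[{url}]\n" + llm(f"Summarize for '{topic}':\n{page[:4000]}"))
    return llm("Write an in-depth, factual research report from these notes:\n\n"
               + "\n\n".join(notes))
```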


Abstract:

Since the first laser was invented, the pursuit of high-energy lasers (HELs) has never ceased. The first revolution in HELs was driven by the fusion of laser and aerospace technology in the 1960s, when chemical rocket engines gave fresh impetus to the birth of gas-flow and chemical lasers, finally turning megawatt lasers from dream into reality. Nowadays, the development of HELs, like that of rocket engines, has entered the age of electricity. The properties of current electric rocket engines are highly consistent with the goals of HELs, including electric drive, effective heat dissipation, little medium consumption, and extremely low weight and small size, which has inspired a second fusion of laser and aerospace and motivated the exploration of potential HELs. As an exploratory attempt, a new configuration of diode-pumped metastable rare-gas laser was demonstrated, with the gain generator resembling an electric rocket engine for improved power-scaling ability.


Original title: Focused Transformer: Contrastive Training for Context Scaling

Large language models have an exceptional capability to incorporate new information in a contextual manner. However, the full potential of this approach is often restrained by a limitation in the effective context length. One solution to this issue is to endow an attention layer with access to an external memory, which comprises (key, value) pairs. Yet, as the number of documents increases, the proportion of relevant keys to irrelevant ones decreases, leading the model to focus more on the irrelevant keys. We identify a significant challenge, dubbed the distraction issue, where keys linked to different semantic values may overlap, making them hard to distinguish. To tackle this problem, we introduce the Focused Transformer (FoT), a technique that employs a training process inspired by contrastive learning. This novel approach enhances the structure of the (key, value) space, enabling an extension of the context length. Our method allows for fine-tuning pre-existing, large-scale models to lengthen their effective context, as we demonstrate by fine-tuning 3B and 7B OpenLLaMA checkpoints. The resulting models, which we name LongLLaMA, exhibit advancements in tasks requiring a long context. We further illustrate that our LongLLaMA models adeptly manage a 256k context length for passkey retrieval.
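
As a sketch of what contrastive shaping of the (key, value) space can look like, the snippet below computes an InfoNCE-style loss in which the key from a query's own document is the positive and keys sampled from other documents are negatives; this is a stand-in to illustrate the idea, not the exact FoT training objective.

```python
import torch
import torch.nn.functional as F

def distraction_contrastive_loss(q, k_pos, k_neg, temperature=0.1):
    """q: (B, D) queries; k_pos: (B, D) same-document keys; k_neg: (B, N, D) other-document keys."""
    q, k_pos, k_neg = (F.normalize(t, dim=-1) for t in (q, k_pos, k_neg))
    pos = (q * k_pos).sum(-1, keepdim=True)             # (B, 1) similarity to own key
    neg = torch.einsum("bd,bnd->bn", q, k_neg)          # (B, N) similarity to distractors
    logits = torch.cat([pos, neg], dim=-1) / temperature
    labels = torch.zeros(q.size(0), dtype=torch.long)   # the positive sits at index 0
    return F.cross_entropy(logits, labels)

# Toy usage: batch of 4 queries, 16 negative keys each, 64-dim vectors.
loss = distraction_contrastive_loss(torch.randn(4, 64), torch.randn(4, 64), torch.randn(4, 16, 64))
print(loss.item())
```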

[-] Martineski@lemmy.fmhy.ml 52 points 1 year ago

Oh, I'm not discrediting him. I'm just pointing out that he gives all the credit to himself like he always does with everything he touches.

[-] Martineski@lemmy.fmhy.ml 97 points 1 year ago

This fucking loser also lies to people that he's the reason OpenAI exists LMFAO: https://www.youtube.com/watch?v=bWr-DA5Wjfw ("Elon Musk on Sam Altman and ChatGPT: I am the reason OpenAI exists")

[-] Martineski@lemmy.fmhy.ml 48 points 1 year ago* (last edited 1 year ago)

either way good job on joining the union.

Me who simply posts daily images/memes to kickstart the sub:

[-] Martineski@lemmy.fmhy.ml 42 points 1 year ago

Yes, we should move away from the platform and this will definitely help with that.

[-] Martineski@lemmy.fmhy.ml 69 points 1 year ago

I hope for the opposite, this platform should die IMO

[-] Martineski@lemmy.fmhy.ml 54 points 1 year ago* (last edited 1 year ago)

Plenty of value was lost; a lot of the news on my singularity sub was posted on Twitter because for some reason people thought it was a great decision to hold on to a dying platform.

[-] Martineski@lemmy.fmhy.ml 36 points 1 year ago

YouTube is also starting to experiment with blocking people from watching videos if they have an adblocker turned on. We live in truly great times when these behemoths are deciding to kill themselves off at the same time, and it makes me truly happy. Open source FTW.

[-] Martineski@lemmy.fmhy.ml 32 points 1 year ago* (last edited 1 year ago)

Well, there go all my links to tweets on my singularity sub. Hopefully this finally pushes posters to switch to another platform to share news about stuff.

Edit: Same as with paywalled articles, I won't be allowing posts whose main focus is a link to a tweet from now on. It's time for this platform to die.

[-] Martineski@lemmy.fmhy.ml 29 points 1 year ago

Fricking flairs, they're very important in the communities that I'm moderating. Ideally with the ability to set multiple flairs at once, because on Reddit you can set only one, which sucks since some posts fit the criteria for two or more flairs.

[-] Martineski@lemmy.fmhy.ml 38 points 2 years ago* (last edited 1 year ago)

Blocking instances for having open registration is some of the weirder shit I have heard recently.

Edit: now that we have a bot problem, I'm against open registration and against federating with instances that have open registration lol

[-] Martineski@lemmy.fmhy.ml 40 points 2 years ago* (last edited 2 years ago)

I went there for a moment to check if any of my subreddits were making polls, and the number of redditors against this strike was depressing. I'm happy that I left this shithole and those braindead people behind.
