1
1
submitted 1 year ago* (last edited 1 year ago) by RoundSparrow@bulletintree.com to c/lemmycode@lemmy.ml

I've been pondering the way language works as an attribute of a community.

Right now there are two attributes to highlight: NSFW and "only moderators can post".

There is an active pull request to add "only the home instance can allow posting", sort of a variation on the "only moderators" can post: https://github.com/LemmyNet/lemmy/pull/3889

That pull request changes 20 source files for the one new feature.

I envision that new attributes will keep coming up, and I see the need for an additional one to make the !multipass@bulletintree.com feature cleaner.

Observation

A general 'attributes' table could be created like the existing language table. And then duplicate the logic in Community Edit for picking the languages associated with a community... except you are picking the attributes associated with a community.

This could possibly cut down on the amount of lemmy_server code changes for each new attribute?

I envision in the future there will be 'members only can post' (subscribed only), and variations of NSFW that people want to implement... and it could be done by not having to add new PostgreSQL columns to tables... and just a general scheme to insert a new registered attribute ID...
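
To make the idea above a bit more concrete, here is a rough Rust sketch of the data shape only, not of the database schema or of any existing lemmy_server type. The attribute names and ID numbers are invented for illustration; the point is that a community holds a set of registered attribute IDs, and the Community Edit logic replaces that set the same way it replaces the picked languages.

```rust
use std::collections::BTreeSet;

/// ID of a registered attribute (would be a row in the proposed 'attributes' table).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct AttributeId(pub i32);

// Hypothetical registered attributes; the ID values are illustrative only.
pub const NSFW: AttributeId = AttributeId(1);
pub const ONLY_MODERATORS_CAN_POST: AttributeId = AttributeId(2);
pub const ONLY_HOME_INSTANCE_CAN_POST: AttributeId = AttributeId(3);

/// The attributes picked for one community.
#[derive(Debug, Default)]
pub struct CommunityAttributes {
    ids: BTreeSet<AttributeId>,
}

impl CommunityAttributes {
    /// Replace the whole set, as Community Edit would after the moderator saves.
    pub fn replace_all(&mut self, picked: Vec<AttributeId>) {
        self.ids = picked.into_iter().collect();
    }

    /// Check a single attribute without needing a dedicated column for it.
    pub fn has(&self, id: AttributeId) -> bool {
        self.ids.contains(&id)
    }
}
```

Adding a new attribute would then mean registering one more ID rather than adding a new field everywhere.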

2
5
submitted 1 year ago* (last edited 1 year ago) by RoundSparrow@lemmy.ml to c/lemmycode@lemmy.ml

lemmy.world
lemmy.ml

See the difference? Can someone submit/locate an Issue and/or Pull Request? Any others?

3
3
submitted 1 year ago* (last edited 1 year ago) by RoundSparrow@lemmy.ml to c/lemmycode@lemmy.ml

This is taking longer than other aggregate updates, and I think the join on the comment table can be eliminated, since the post is already identified by NEW.post_id (or OLD.post_id):

CREATE FUNCTION public.community_aggregates_comment_count() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
begin
  IF (was_restored_or_created(TG_OP, OLD, NEW)) THEN
    update community_aggregates ca
    set comments = comments + 1 from comment c, post p
    where p.id = c.post_id
      and p.id = NEW.post_id
      and ca.community_id = p.community_id;
  ELSIF (was_removed_or_deleted(TG_OP, OLD, NEW)) THEN
    update community_aggregates ca
    set comments = comments - 1 from comment c, post p
    where p.id = c.post_id
      and p.id = OLD.post_id
      and ca.community_id = p.community_id;
  END IF;
  return null;
end $$;

pg_stat_statements shows it as:

update community_aggregates ca set comments = comments + $15 from comment c, post p where p.id = c.post_id and p.id = NEW.post_id and ca.community_id = p.community_id

TRIGGER:

CREATE TRIGGER community_aggregates_comment_count AFTER INSERT OR DELETE OR UPDATE OF removed, deleted ON public.comment FOR EACH ROW EXECUTE FUNCTION public.community_aggregates_comment_count();

4
1

person_aggregates is interesting, because it is tracked for all known person accounts on the server, whereas site_aggregates does not track all known instances on the server.

Personally I think lemmy-ui needs to be revised to clearly identify that a profile is from another instance and that the count of posts, comments, and the listing of same is incomplete. If a user from another instance is being viewed, you only see what your local instance has comments, posts, and votes for. This will almost always under-represent a user from another instance.

PREMISE: since person_aggregates has a SQL UPDATE performed in real time on every comment or post creation, I suggest we at least make that more useful: add a timestamp of 'last_create', either generic to both post and comment, or individual to each type. I also think a last_login timestamp would be of great use, and the site_aggregates of activity could look at these person_aggregates timestamps instead of having to go analyze comments and posts per user on a scheduled job.

5
2
submitted 1 year ago* (last edited 1 year ago) by RoundSparrow@lemmy.ml to c/lemmycode@lemmy.ml

For weeks I've been trying to figure out how to get more organized logging out of the Rust code, and I have never worked with Rust before. The default behavior of lemmy_server is to put all Rust logging into the Linux syslog and you typically filter by the Linux user account 'lemmy'.

Where to write Files ==========
I did "Lemmy from Scratch" and run without Docker, so for my lemmy_server I throw logs into the /home/lemmy/logs folder. I don't think Docker even creates a /home/lemmy directory, not sure. I'm inclined to make this an environment variable to set the output folder - suggestions welcome.
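
A minimal sketch of the environment variable idea, using only the Rust standard library. The variable name LEMMY_LOG_DIR is my invention and is not an existing Lemmy setting. If the variable is unset or blank the function returns None, which would simply mean no file logging, so a Docker install without a /home/lemmy directory is not forced to have one.

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical variable name; not an existing Lemmy setting.
pub const LOG_DIR_ENV: &str = "LEMMY_LOG_DIR";

/// Read the log folder from the environment; None means file logging is off.
pub fn log_dir_from_env() -> Option<PathBuf> {
    log_dir_from(env::var(LOG_DIR_ENV).ok())
}

/// Separated from the env lookup so the parsing can be checked on its own.
pub fn log_dir_from(raw: Option<String>) -> Option<PathBuf> {
    raw.map(|s| s.trim().to_owned())
        // Treat an empty or whitespace-only value the same as unset.
        .filter(|s| !s.is_empty())
        .map(PathBuf::from)
}
```

The result would take the place of a hard-coded folder wherever the log directory is chosen.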

Rust Library =========
cargo add tracing-appender

Code surgery to /src/lib.rs ==========

use tracing::Level;
use tracing_appender::non_blocking::WorkerGuard;
use tracing_subscriber::{
    fmt::{self, writer::MakeWriterExt}
};


pub fn init_logging(opentelemetry_url: &Option<Url>) -> Result<Vec<Option<WorkerGuard>>, LemmyError> {
  LogTracer::init()?;

  let log_dir = Some("/home/lemmy/logs");
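  // Code may have multiple log files (target tags) in the future, so return a Vec.
  // Each WorkerGuard must be kept alive by the caller for the life of the process;
  // once it is dropped, the non-blocking writer for that file stops.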
  let mut guards = Vec::new();

  let file_filter = Targets::new()
    .with_target("federation", Level::TRACE);

  let file_log = log_dir
      .map(|p| tracing_appender::non_blocking(tracing_appender::rolling::daily(p, "federation")))
      .map(|(appender_type, g)| {
          let guard = Some(g);
          guards.push(guard);
          fmt::Layer::new()
              .with_writer(appender_type.with_max_level(Level::TRACE))
              .with_ansi(false)
              .with_filter(file_filter)
      });

  let log_description = std::env::var("RUST_LOG").unwrap_or_else(|_| "info".into());

  let targets = log_description
    .trim()
    .trim_matches('"')
    .parse::<Targets>()?;

  let format_layer = {
    #[cfg(feature = "json-log")]
    let layer = tracing_subscriber::fmt::layer().json();
    #[cfg(not(feature = "json-log"))]
    let layer = tracing_subscriber::fmt::layer();

    layer.with_filter(targets.clone())
  };

  let subscriber = Registry::default()
    .with(format_layer)
    .with(ErrorLayer::default())
    .with(file_log);

  if let Some(_url) = opentelemetry_url {
    #[cfg(feature = "console")]
    telemetry::init_tracing(_url.as_ref(), subscriber, targets)?;
    #[cfg(not(feature = "console"))]
    tracing::error!("Feature `console` must be enabled for opentelemetry tracing");
  } else {
    set_global_default(subscriber)?;
  }

  tracing::warn!("zebracat logging checkpoint A");
  tracing::info!(target:"federation", "zebracat file logging startup");

  Ok(guards)
}

Code surgery to /src/main.rs ==========

let _guards = init_logging(&SETTINGS.opentelemetry_url)?;

How to use ================
Anywhere you want in Lemmy code, you can now log to your file: tracing::info!(target:"federation", "this will go to the file");

6
1

"crates/apub/src/activities/community/announce.rs", line: 46

That line of code seems to be just a logging line. Can any Rust programmers chime in on how we can get the actual value of the data logged?

https://github.com/LemmyNet/lemmy/blob/0c82f4e66065b5772fede010a879d327135dbb1e/crates/apub/src/activities/community/announce.rs#L46

7
1
  1. If the person who creates a post deletes it, how does that impact the comments that other users have made on that post? Can a user still find their personal comments on their user profile? Can you still find saved comments and posts from a deleted post?

  2. What if an entire Lemmy instance goes offline (is shut down)? Are postings removed? Comments? Profiles?

8
1

See comment

9
1
submitted 1 year ago* (last edited 1 year ago) by RoundSparrow@lemmy.ml to c/lemmycode@lemmy.ml

With setting value:

  1. only local data writes, returning a specific error to federation peers that it is in read-only mode and not accepting new comments, posts, likes, subscribes from peer servers.

  2. a federation-only mode where local users are not allowed to change their settings, delete their data, create new posts, create new comments, record votes, edit, etc.

  3. read-only mode site-wide, both local API users and federated peers not allowed to write data.

The second one would also be useful when a site wants to go offline and has a period of time during which it is 'archived', the way GitHub archives projects.

I think upgrade instructions for new versions would recommend turning this on for a 5-minute period and doing a shakedown read-only test of the site before throwing the switch to allow new data inserts into the PostgreSQL database backend.

Standard response codes would be issued to all peer servers and API clients describing the specific read-only nature of the instance, so they can generate an appropriate message in their user interface.

I am making this recommendation based on the version 0.18.1 severe performance problems that Lemmy is having with 1 million comments in the database. To scale, it is essential that new versions can be released into production and rolled-back.
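
The three setting values listed above could be checked in one place before any write is attempted. The sketch below is only an illustration under my own assumptions: the type names, the error code strings, and the choice of HTTP 503 are all invented here and are not part of lemmy_server or of any agreed federation response.

```rust
/// Hypothetical setting; the three restricted modes follow the numbered list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WriteMode {
    Normal,
    LocalOnly,      // 1. only local data writes
    FederationOnly, // 2. only federated writes
    ReadOnly,       // 3. nobody writes
}

/// Where a write request came from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WriteSource {
    LocalApi,
    FederationPeer,
}

/// What a refused client or peer would be told (values are assumptions).
#[derive(Debug, PartialEq, Eq)]
pub struct ReadOnlyError {
    pub http_status: u16,
    pub code: &'static str,
}

/// Decide whether a write is allowed, and if not, which specific error to return.
pub fn check_write(mode: WriteMode, source: WriteSource) -> Result<(), ReadOnlyError> {
    use WriteMode::*;
    use WriteSource::*;
    let code = match (mode, source) {
        (Normal, _) | (LocalOnly, LocalApi) | (FederationOnly, FederationPeer) => return Ok(()),
        (LocalOnly, FederationPeer) => "federation_read_only",
        (FederationOnly, LocalApi) => "local_read_only",
        (ReadOnly, _) => "site_read_only",
    };
    // 503 is an assumed status; the point is one consistent, documented response.
    Err(ReadOnlyError { http_status: 503, code })
}
```

Keeping the decision in a single function means the specific reason reaches every API client and peer in the same form.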

10
1
11
1

12
1
submitted 1 year ago* (last edited 1 year ago) by RoundSparrow@lemmy.ml to c/lemmycode@lemmy.ml

From personal experience, I know Lemmy.ml, Beehaw.org, Lemmy.world are performing very badly. So far, I have not been able to convince any of these big server operators to share in bulk their lemmy_server logging as to what is going on.

Tuning and testing is difficult because 1) the less data you have, the faster Lemmy becomes. The big servers have accumulated more data. 2) the less federation activity you have, the less likely you are to run into resource limits and timeout values. These big servers have large numbers of peer servers subscribing to communities.

Nevertheless, we need to do everything we can to try and help the project as a whole.

 

HTTP and Database Parameters

https://github.com/LemmyNet/lemmy/blob/0f91759e4d1f7092ae23302ccb6426250a07dab2/crates/db_schema/src/utils.rs#L45C1-L47C69

const FETCH_LIMIT_DEFAULT: i64 = 10;
pub const FETCH_LIMIT_MAX: i64 = 50;
const POOL_TIMEOUT: Option<Duration> = Some(Duration::from_secs(5));

https://github.com/LemmyNet/lemmy/blob/0f91759e4d1f7092ae23302ccb6426250a07dab2/src/lib.rs#L39

/// Max timeout for http requests
pub(crate) const REQWEST_TIMEOUT: Duration = Duration::from_secs(10);

See also that Lemmy Rust code has a 5-second default PostgreSQL connection timeout for pooling, and default of 5 pool instances. https://github.com/LemmyNet/lemmy/issues/3394

 

lemmy_server behavior

Exactly what gets logged in the Rust code if these values are too low? Can we run a less-important (testing) server with these values set to just 1 and look at what is being logged so we can notify server operators what to grep the logs for?

What are the symptoms?

What can we do to notify server operators that this is happening? Obviously, since the resource being exhausted is the database itself, using a database table to increment an error count might run into problems under heavy load. Can we have a connection to the database server with higher timeouts and a dedicated table (with no locks) outside the connection pool and have the error logic set a timestamp and count of when these resource limits are being hit in production?
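
One alternative that avoids the database entirely while it is under load is to keep the count and last-hit timestamp in process memory and only report them later (in a log line, or in whatever way operators end up reading them). This is a sketch with invented names, using only the standard library; it is not existing lemmy_server code, and where record() would be called from is left open.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

/// Lock-free counter for "a resource limit was hit" events.
pub struct LimitCounter {
    hits: AtomicU64,
    last_hit_unix_secs: AtomicU64,
}

impl LimitCounter {
    pub const fn new() -> Self {
        Self {
            hits: AtomicU64::new(0),
            last_hit_unix_secs: AtomicU64::new(0),
        }
    }

    /// Call where a pool or HTTP timeout is detected; never touches the database.
    pub fn record(&self) {
        let now = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0);
        self.hits.fetch_add(1, Ordering::Relaxed);
        self.last_hit_unix_secs.store(now, Ordering::Relaxed);
    }

    /// Returns (total hits, unix seconds of the most recent hit; 0 if none yet).
    pub fn snapshot(&self) -> (u64, u64) {
        (
            self.hits.load(Ordering::Relaxed),
            self.last_hit_unix_secs.load(Ordering::Relaxed),
        )
    }
}

/// Hypothetical process-wide counter for database pool timeouts.
pub static POOL_TIMEOUTS: LimitCounter = LimitCounter::new();
```

The counts are lost on restart, so this complements rather than replaces a persistent record.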

13
1
14
1

Example posting: https://beehaw.org/post/823476

If you hand-count the comments, there are 6. But the posting says 10.

This could involve a lot of data, do it in batches? Please share what SQL you can come up with. Thank you.

15
1

see comments

16
1
submitted 1 year ago* (last edited 1 year ago) by RoundSparrow@lemmy.ml to c/lemmycode@lemmy.ml

I'm working on the lemmy-ui-svelte frontend/client application. This code page in particular: https://github.com/ando818/lemmy-ui-svelte/blob/main/src/routes/post/%5Bid%5D/%2Bpage.server.js

Testing the same API against SJW server with two different postings that both have comments:

Posting 372144 returns nothing

curl "https://sh.itjust.works/api/v3/comment/list?post_id=372144&limit=300&sort=New"

Posting 123406 returns comments

curl "https://sh.itjust.works/api/v3/comment/list?post_id=123406&limit=300&sort=New"

Am I making some kind of mistake here? Is it because the problematic posting community is from a federated instance and not local to SJW? Thank you.

EDIT: ok, adding &type_=All to the parameters gave the comments on both posts.

17
1
submitted 1 year ago* (last edited 1 year ago) by RoundSparrow@lemmy.ml to c/lemmycode@lemmy.ml

This is over a decade old, but the general idea here is to do what you can to simulate an overloaded database, so that the Rust and NodeJS code is tested to gracefully report to the end user (and log for server operators) that the database backend is failing.

https://www.endpointdev.com/blog/2012/11/how-to-make-postgresql-query-slow/

Lemmy has been falling over with nginx 50x errors under overload and denial-of-service situations. Try to exercise the error behavior and harden the Rust and NodeJS code.

18
1
19
1
20
1

Lemmy Code / App Technical


The code and application behind Lemmy. Beta testing new releases, API coding, custom changes, adding new features, developers

See also: the !lemmyperformance@lemmy.ml community; it's not always clear which one to put a topic into. With "lemmycode" I'm trying to focus more on actual code change proposals.

!lemmydev@lemm.ee

founded 1 year ago