
AI Content Moderator - Why and how to build one with Gemini 1.5

How and why would you build an AI-based content moderator? Learn how AI improves moderation with contextual awareness, scalability, and customizable rules, and how you can build one with Next.js, MongoDB, Permit.io, and Gemini 1.5 Flash.
Gabriel L. Manor

Online communities thrive on engagement, but if you ask any community moderator, they’ll tell you that ensuring a safe and welcoming environment is far from an easy task. With millions of users generating vast amounts of posts, comments, and discussions daily, platforms struggle to balance expression with their community’s own guidelines. Relying on manual moderation or simplistic keyword-based filters often leads to inconsistencies—allowing harmful content to slip through while mistakenly flagging innocent discussions.

This blog aims to explain how LLMs can be used for AI-driven content moderation and how community managers can use them to enforce custom, context-aware moderation rules that align with their community’s values.

We’ll also provide a step-by-step guide to building an AI-powered content moderation system using Next.js, MongoDB, Permit.io, and Gemini 1.5 Flash.

Why Build an AI Community Moderator?

What’s wrong with current community moderation tools?

As mentioned previously, community managers face an overwhelming challenge: they need to moderate an ever-growing stream of user-generated content while maintaining fair, context-aware enforcement of community guidelines.

Traditional moderation methods—whether manual or rule-based—often fall short, either missing harmful content or over-policing legitimate discussions, and AI-assisted moderation might be a good solution.

Unlike static keyword filters, rigid rule-based systems, or the manual review of content by hundreds of moderators, AI can analyze community engagement in bulk, considering the intent and nuance behind the content. Using AI, human moderators can create customizable moderation policies that reflect the specific values and needs of individual communities, thus making their job much easier—especially from a scaling perspective.

Large online communities generate an immense volume of content every day. Human moderators alone cannot keep up, and rigid keyword-based filters often result in false positives or negatives.

Another issue to consider is context and customization: since each community has unique values and moderation standards, there isn’t really a generic, one-size-fits-all moderation tool that fits every community out there.

How can AI help with this?

LLM-powered moderation tools can significantly improve traditional methods by handling large volumes of content efficiently, detecting harmful behavior with contextual awareness, and allowing for customizable enforcement of community rules.

Unlike basic keyword filters, AI can analyze entire conversations to understand nuance, reducing false positives and negatives. By automating moderation, these tools free up human moderators to focus on complex cases while ensuring enforcement remains fair and adaptable.

You Can Build This Yourself. It’s Easy, Too.

At first glance, building an AI-powered moderation system might sound complex—something only large platforms with dedicated AI teams can afford to develop. However, thanks to a couple of frameworks and tools, you can integrate AI-driven moderation into your own application with relative ease.

By combining Next.js for frontend and backend development, MongoDB for data storage, Permit.io for rule-based enforcement, and Gemini 1.5 Flash for AI-powered content analysis, we can create a scalable, customizable moderation system in just a few steps.
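Before diving in, here is a rough sketch of the flow we are aiming for. Everything in it is a placeholder (the analyzeWithGemini helper and the response shapes are not real code from the repo); the actual implementations are built step by step in the sections below.

// Simplified flow for a submitted post: authorization first, then AI moderation.
// `permit` and `analyzeWithGemini` are placeholders for code we'll write later.
async function handleNewPost({ user, member, post, community }) {
  // 1. Authorization (Permit.io, ABAC): is this member currently allowed to post?
  const allowed = await permit.check(
    { key: user.id, attributes: { role: member.role, timed_out: member.timedOut } },
    "create-post",
    "community"
  );
  if (!allowed) return { status: 403, message: "Permission denied" };

  // 2. Moderation (Gemini): does the post violate the community's moderation rules?
  const violations = await analyzeWithGemini(post, community.moderationRules);
  if (violations.length) return { status: 400, message: "Post rejected", violations };

  // 3. Otherwise, persist the post as usual.
  return { status: 200, post };
}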

In the next section, we’ll walk through how to build a community-driven social platform with AI-powered moderation.

Building an AI-Driven Moderation System

In this tutorial, we’ll build a community-focused social media application that integrates LLM-powered content moderation with scalable policy enforcement tools. We’ll use:

  • Next.js for the frontend and backend of our application.
  • MongoDB as the database.
  • Permit.io for fine-grained permission control, determining who can create moderation rules, submit posts, or take moderation actions.
  • Gemini 1.5 Flash to analyze content, generate actionable moderation filters, and enforce customizable community rules.

The complete code for this tutorial is available on GitHub.

Prerequisites

  • You should be familiar with Next.js and MongoDB.
  • Have Docker installed on your system.

Setting up the project

I already created a starter template for the app so we can focus solely on adding the core functionality. Clone the GitHub repo and install the packages with the following commands:

git clone -b starter https://github.com/Tammibriggs/we-conect-community.git

cd we-conect-community

npm install

Now, when we start the app with npm run dev, we will see the following:

image.png

Next, for the sign-in and other functionality to work, create a .env.local file in the root directory and supply the following environment variables:

MONGO_URL = <your-mongodb-uri>
BASE_URL = http://localhost:3000
SECRET_TOKEN = <your-jwt-secret-any-secret-will-work>

To get the MONGO_URL, go to the MongoDB website, sign in, and navigate to the Create Project page.

image.png

Enter the name of your project, click Next, and then Create Project. After that, we will be taken to the Overview page of our new project.

image.png

Next, let's create a cluster. Click the Create button, then on the next page select a free cluster tier and at the bottom-right of the page, click on Create Deployment. We will see the following modal which we will use to configure connections to our cluster.

image.png

If we are OK with the username and generated password, click Create Database User > Choose a connection method > Drivers, and we will be taken to the Connect step, where we will see our connection URL.

image.png

Copy the URL and use it as the value of MONGO_URL in the .env.local file.

Next, let’s configure the network access to allow access from anywhere so we won’t encounter connection issues during development. In the sidebar, click on Network Access, then click on ADD IP ADDRESS on the displayed page.

image.png

In the modal, click on ALLOW ACCESS FROM ANYWHERE and click Confirm.

Now we can go over to our application, enter a username, and sign in. A dummy community will be automatically created and we will be given an admin role as the first signed-in user.

image.png
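For orientation, the starter template stores each community, its rules, its moderation filters, and its members in MongoDB. The rest of this tutorial reads and writes the fields sketched below; this shape is inferred from the code we will add later, so the actual Mongoose schema in the repo may differ slightly.

// Illustrative example of a community document as used in this tutorial
const exampleCommunity = {
  rules: [{ title: "Be respectful", description: "No personal attacks or hate speech." }],
  moderationFilters: {
    presets: { enabled: true, options: [/* e.g. the spam filter */] },
    generatedFilters: { enabled: true, options: [/* rules generated by Gemini */] },
  },
  members: [
    {
      userId: "<mongodb-object-id>",
      role: "admin", // or "member"
      restriction: { endTime: null, violationsCount: 0, violations: [] },
    },
  ],
};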

Defining Access Policies

In our community-focused social media app, we will use Permit.io to determine which features users can access based on their role and to restrict privileges based on violations of moderation rules. For this, we will use the Attribute-Based Access Control (ABAC) authorization model, which gives us more control to define granular policies.

To get started first sign in to Permit.io and create a workspace.

image.png

Next, let’s define user attributes needed to create conditions for our ABAC policy. Navigate to the Directory page from the sidebar, click on Settings at the top-right of the page, then on the next page click on User Attributes from the sidebar.

image.png

Here, we will define the following attributes.

  • timed_out (Boolean): To determine whether a user is timed out, so their access can be restricted.
  • role (String): To determine the user’s role in a particular community.
  • violations_count (Number): To determine how many times a user has violated the community’s moderation rules, so access can be progressively restricted.

Click on the Add Attribute button to add these attributes.
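Note that these attributes don’t need to be stored on the Permit user object; throughout this tutorial we pass them dynamically with each permit.check call, so the ABAC conditions are evaluated against the user’s current state in our database. Here is a quick preview (the full calls appear later in the tutorial):

// Preview: attributes are supplied at check time, so Permit evaluates the ABAC
// user sets against live application state
const permitted = await permit.check(
  {
    key: userId, // the MongoDB user _id we sync to Permit on sign-in
    attributes: {
      role: "member",       // the user's role within this community
      timed_out: false,     // true while a moderation timeout is active
      violations_count: 1,  // number of moderation-rule violations so far
    },
  },
  "create-post",
  "community"
);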

Next, navigate away from the settings page and click on Policy in the sidebar. On this page, we will add a new role that will be assigned as the default role for users in our app, and create a resource that reflects the object we want to manage (in our case, “community”), along with the resource’s actions and the conditions under which those actions are allowed.

In the Roles tab, click Add Role. Enter user as the name and click on Save. Next, in the Resources tab, click Create a Resource.

image.png

Enter community as the resource name and add a create-post and react action aside from the default actions, then click Save.

Next, in the Policy Editor tab, check the read action for the user role and click Save Changes.

image.png

Next, let’s create User Sets, which determine the privileges of users that match a specified condition. We will define three user sets:

  • community admin: For community admin privileges.
  • community member: For community member privileges.
  • timed out community member: For removing a user’s posting privileges as a result of rule violations.

In the Policy Editor tab click on Create > ABAC User Set. First, create the community admin user set and define its conditions.

image.png

Enable its allowed actions and click Save Changes.

image.png

Next, do the same for the community member and timed out community member user sets; a conceptual summary of all three conditions follows below.

image.png

image.png

image.png

image.png
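Conceptually, the conditions behind these three user sets look something like the sketch below. This is only an approximation based on the attributes we defined earlier; configure the exact expressions and allowed actions in the Policy Editor to match your needs.

// Rough summary of the user set conditions (illustrative, not actual Permit syntax)
const userSetConditions = {
  "community admin": "user.role equals 'admin'",
  "community member": "user.role equals 'member' AND user.timed_out equals false",
  "timed out community member": "user.role equals 'member' AND user.timed_out equals true",
};
// Allowed actions (roughly): admins get all community actions, members get
// create-post and react, and timed-out members lose create-post.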

Integrating Permit into our application

There are several ways to integrate Permit into our application: the Permit SDK can be pointed either at the Cloud PDP (Policy Decision Point) or at a Container PDP. For this tutorial, we will use the Container PDP.

To do this, first run the following commands to pull and start the Permit PDP container.

docker pull permitio/pdp-v2:latest

docker run -it \
  -p 7766:7000 \
  --env PDP_API_KEY=<YOUR_API_KEY> \
  --env PDP_DEBUG=True \
  permitio/pdp-v2:latest

To get YOUR_API_KEY, navigate back to the Permit dashboard, click on Projects in the sidebar, then click on the three dots at the top-right of the Development card and select Copy API Key.

image.png

Now replace the placeholder in the above docker run command and press Enter. With that, the Permit PDP is running on http://localhost:7766. Awesome!

We also need to add the Permit API key and PDP URL to our .env.local file, as we will need them to initialize the Permit SDK.

PERMIT_IO_PDP_URL = http://localhost:7766
PERMIT_IO_API_KEY = <your-permit-api-key>

Now, to initialize Permit in our Next.js project, first install the SDK using the following command:

npm install permitio

Then navigate to server/utils, create a permit.js file, and add the following lines of code:

import { Permit } from "permitio";

// Initialize the Permit SDK, pointing it at our local container PDP
const permit = new Permit({
  pdp: process.env.PERMIT_IO_PDP_URL,
  token: process.env.PERMIT_IO_API_KEY,
});

export default permit;

Automating Content Moderation with Community Rules

In this section, we will use Gemini 1.5 Flash to analyze community rules and generate actionable moderation rules whose violations can be detected in submitted posts. We will enable community admins to define custom moderation rules tailored to their community’s values and block submitted posts that violate these rules or the generated moderation rules. Finally, we will limit community members’ privileges based on rule violations, leveraging the power of Permit.

Let’s start by initializing the Gemini SDK. For that, we will need a Gemini API key, which can be obtained from Google AI Studio.

image.png

In AI Studio, click Create API key, then in the modal that appears select an existing Google Cloud project and click Create API key in existing project. Copy the generated API key and add it to the .env.local file.

GEMINI_API_KEY = <your-gemini-api-key>

Next, install the Gemini SDK with the following command:

npm install @google/generative-ai

Then navigate to server/utils, create a gemini.js file and add the following lines of code:

import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);

export default genAI;
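If you’d like to confirm the key works before wiring Gemini into the moderation flow, a throwaway script like the one below will do; the file name and the node --env-file flag (Node 20.6+) are just one way to run it, not part of the tutorial repo.

// test-gemini.mjs — quick sanity check (hypothetical file, not part of the app)
// Run with: node --env-file=.env.local test-gemini.mjs
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

const result = await model.generateContent("Reply with the single word: ok");
console.log(result.response.text()); // expect a short confirmation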

Next, let’s add newly signed-in users to Permit and assign them the user role we created. Navigate to pages/api/auth/signin.js and import Permit:

import permit from "@/server/utils/permit";

Then, modify the signIn function to be similar to the following:

const signIn = async (req, res) => {
  try {
    const username = req.body.username;
    if (!username?.trim()) {
      return res.status(422).json({ message: '"username" is required' });
    }
    let user = await User.findOne({ username });
    if (!user) {
      // Create the user and assign the resulting document to the user variable.
      // After that, check if the dummy community document exists;
      // if it does, add the new user as a member, otherwise create the document and add the new user as admin.
      user = await User.create({ username });
      await permit.api.syncUser({
        key: user._id,
        role_assignments: [
          {
            role: "user",
            tenant: "default",
          },
        ],
      });

    ...

  } catch (err) {
    return res.status(500).json({ message: "Internal Server Error" });
  }
};

Note: To make sure that the admin user (the first signed-in user) is also synced to Permit, go over to MongoDB, delete the test database, and then open the app in a new tab; since Session Storage is used for the session, a new tab gives you a fresh sign-in.

Generating Moderation Rules from Community Rules with Gemini

Now, let’s take the first step of utilizing Gemini for innovative content moderation. The first thing we will do is instruct Gemini to analyze created community rules and generate actionable moderation rules, then save the generated rules in the database.

Navigate to pages/api/communities/rules and add the following imports and prompt at the top of the file, with the prompt placed immediately after the imports.

import genAI from "@/server/utils/gemini";
import permit from "@/server/utils/permit";

const prompt = `
You are an AI assistant tasked with analyzing the provided social media community guidelines. These guidelines define acceptable behavior and content for users in the community.
Your goal is to identify actionable moderation rules specifically tailored for detecting violations in a post using the Gemini 1.5 flash language model (LLM) without requiring fine-tuning. A post consists only of:
- **Body**: The text content of the post.
- **Media**: Descriptive features derived from the post's image.
### Instructions:
- The generated moderation rules must be identifiable by analyzing **only** the post's body or media.
- Do **not** include moderation rules that depend on additional context or information such as:
- The community's core themes or topics.
- User account history or behavior.
- Post origin (e.g., location or author metadata).
- Community-specific information not present in the post body or media.
- Avoid vague terms like "irrelevant" or "off-topic." Instead, clearly define the type of content being restricted.
### Output Requirements:
- The generated moderation rules must be provided in JSON format.
- The top-level JSON structure must be a JSON Array.
- Each moderation rule must be a JSON Object within the Array that includes:
  - **A title** (maximum 40 characters).
  - **A description** (maximum 30 words).
- Titles must be structured so that adding the word "Block" at the beginning results in a grammatically correct title.
- Titles must be **direct, specific, and self-explanatory** without relying on the description to clarify their meaning.
- Focus only on violations explicitly described in the provided guidelines.
- Combine similar or overlapping rules into one concise and comprehensive rule.
- Do **not** include generic or overly broad titles.
### Example Output Format:
\`\`\`json
[
 {
   "title": "Posts with Hate Speech",
   "description": "Block posts with language targeting groups based on race, religion, or gender."
 },
 {
    "title": "Posts with Graphic Violence",
   "description": "Block posts containing images with excessive gore or harm."
 }
]
\`\`\`
### What to Avoid:
- **Bad Example:**
**Title:** Posts Promoting Illegal Events
**Why?** This requires context about the legality of events, which may not be fully apparent in the post content.
- **Bad Example:**
**Title:** Irrelevant Content
**Why?** The scope of what is considered "irrelevant" is unclear and not self-explanatory.
- **Bad Example:**
**Title:** Off-Topic Posts
**Why?** "Off-topic" is a vague term that does not specify the actual violation.
Now, analyze the following guidelines and generate the rules.
`;

The above prompt instructs Gemini to generate moderation rules whose violations can be detected by analyzing only the text and image of a submitted post.

Next, modify the createRule function, which is responsible for storing created community rules in the database, to the following:

const createRule = async (req, res) => {
  try {
    const userId = req.userId;
    const { communityId, title, description } = req.body;

    const community = await Community.findById(communityId);
    if (!community) {
      return res.status(404).json({ message: "Community not found" });
    }

    const member = community.members.find(
      (member) => member.userId.toString() === userId
    );
    const permitted = await permit.check(
      {
        key: userId,
        attributes: {
          role: member.role,
        },
      },
      "create",
      "community"
    );
    if (!permitted) {
      return res.status(403).json({ message: "Unauthorized" });
    }

    community.rules.push({ title, description });

    const model = genAI.getGenerativeModel({
      model: "gemini-1.5-flash",
    });

    const result = await model.generateContent(
      `${prompt}\n\n
      ${title}: ${description}`
    );
    const responseText = result.response.text();

    try {
      let cleanedText = responseText.replace(/```json|```/g, "").trim();
      const jsonResponse = JSON.parse(cleanedText);
      const generatedFilters = community.moderationFilters.generatedFilters;
      generatedFilters.options = [
        ...jsonResponse,
        ...(generatedFilters.options ? generatedFilters.options : []),
      ];
    } catch (err) {
      // If Gemini returns malformed JSON, skip saving the generated filters
    }

    await community.save();
    return res.status(200).json({ message: "ok" });
  } catch (err) {
    return res.status(500).json({ message: "Internal Server Error" });
  }
};

The modifications made to the function include checking whether a user is permitted to carry out the create action using Permit, providing the prompt along with the community rule being created to Gemini 1.5 Flash, and storing the resulting moderation rules in the database.

To see this in action, go over to the app in the browser, click on Set up rules, and in the modal that appears select an Example Rule and click Create.

image.png

Now click Auto Moderation and scroll to the bottom of the modal that appears.

image.png

And here we have it—a list of generated moderation filters derived from the community rules. At the top of the modal, you will notice a Spam Filter, and you might be wondering why it is included. Well, in a practical moderation system, generative AI cannot address all aspects of moderation or may not be appropriate for certain tasks—for example, blocking posts if a user does not have a profile picture. In such cases, preset filters can be included to handle these areas. However, for this tutorial, the Spam Filter is primarily added to demonstrate how to timeout users who violate moderation rules using Permit.io.

Analyzing Submitted Posts Against Moderation Rules and Restricting Permissions with Gemini and Permit

Now, let’s instruct Gemini to analyze submitted posts against the generated moderation rules and flag violations by returning the rules that were broken, then block flagged posts and limit user permissions for spam violations.

Navigate to pages/api/community-posts/index.js and add the following import:

import permit from "@/server/utils/permit";

Next, modify the createPost function to look like the following:

const createPost = async (req, res) => {
  try {
    const userId = req.userId;
    await runMiddleware(req, res, uploadMiddleware);
    const { content, communityId } = req.body;

    const community = await Community.findById(communityId);
    if (!community) {
      return res.status(404).json({ message: "Community not found" });
    }

    const memberIndex = community.members.findIndex(
      (member) => member.userId.toString() === userId
    );
    let member = community.members[memberIndex];
    const now = new Date();
    const restrictionEndTime = member?.restriction?.endTime
      ? new Date(member.restriction.endTime)
      : null;

    // Remove the user's restriction if the timeout has elapsed
    if (restrictionEndTime && now > restrictionEndTime) {
      member.restriction.endTime = undefined;
    }

    const permitted = await permit.check(
      {
        key: userId,
        attributes: {
          role: member.role,
          timed_out: !!member?.restriction?.endTime,
        },
      },
      "create-post",
      "community"
    );

    if (!permitted) {
      return res.status(403).json({ message: "Permission denied" });
    }

    ....

  } catch (err) {
    res.status(500).json({ message: "Internal Server Error" });
  }
};

The modifications to the above function include using Permit to determine whether a user can carry out the create-post action, based on their role and whether they have been timed out, and resetting the restriction end time once the timeout has elapsed.

Next, navigate to server/utils/index.js and add the following import:

import genAI from "./gemini";

Next, add the following lines of code:

// Function to construct the prompt from rules and post
function constructPostCheckPrompt(rules, postContent) {
  const prompt = `You are a content moderation assistant. Your task is to analyze a social media post and determine if it violates any of the provided moderation rules.

      **Moderation Rules:**

       ${JSON.stringify(rules)}

      **Post Analysis:**

      Analyze the following post, considering both its text content and any accompanying media (if applicable). For each moderation rule, determine if the post violates the rule.

      **Post Content:**

      ${JSON.stringify(postContent)}

      **Output:**

      Present your output in a JSON format as an array of objects, each object should represent the rule and the post's status. The object should have the following fields \`rule_title\`, \`violation_status\` (Boolean; \`true\` for violation, \`false\` otherwise), and \`reasoning\`. For the violation status, set it to \`true\` if there's a clear violation; otherwise, it should be \`false\`. If the status is \`true\`, the reasoning field should contain a concise justification for the violation. If \`false\`, the reasoning should indicate why the rule was not violated.
      `;
  return prompt;
}

const mediaToParts = async (media) => {
  return [
    {
      inlineData: {
        data: Buffer.from(fs.readFileSync(media.path)).toString("base64"),
        mimeType: media.mimetype,
      },
    },
  ];
};

In the above code, the constructPostCheckPrompt function builds a prompt from the moderation rules and the submitted post, instructing Gemini to analyze the post, determine whether it violates any of the provided rules, and return the rules that were violated. The mediaToParts function returns the structure used to supply an image to Gemini.

Next, to use this function, in the same file, modify the checkPostForViolations function to the following:

const checkPostForViolations = async (
  post,
  media,
  member = {},
  moderationFilters
) => {
  const { presets, generatedFilters } = moderationFilters;
  let violations = [];
  if (member.role !== "admin") {
    member.restriction = member.restriction ? member.restriction : {};
    if (presets.enabled) {
      const filters = presets.options;
      const presetViolations = await evaluatePresetsCriteria(post, filters);

      // Set the post status to rejected if any of the violated preset filters specifies a 'blockPost' action
      // and set the restriction end time to the combined timeout of all violated filters
      if (Object.keys(presetViolations).length) {
        violations = Object.values(presetViolations).flat();
        const isBlockPost = filters.some(
          (filter) =>
            !!presetViolations[filter.name] &&
            filter.actions.includes("blockPost")
        );
        if (isBlockPost) {
          post.status = "rejected";
        }

        const combinedTimeout = filters.reduce((acc, filter) => {
          if (
            !!presetViolations[filter.name] &&
            filter.actions.includes("timeoutUser")
          ) {
            const timeoutDuration = getTimeoutDuration(
              filter.actionConfig.timeoutDuration
            ).getTime();

            return acc + timeoutDuration;
          }
          // Keep the accumulator unchanged for filters that weren't violated
          return acc;
        }, 0);

        if (combinedTimeout) {
          member.restriction.endTime = new Date(combinedTimeout);
        }
        member.restriction.violationsCount = (member.restriction.violationsCount || 0) + 1;
        member.restriction.violations = violations;
      }
    }

    if (generatedFilters.enabled && generatedFilters.options?.length) {
      const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });
      const rules = generatedFilters.options.map((filter) => ({
        title: filter.title,
        description: filter.description,
      }));
      const prompt = constructPostCheckPrompt(rules, post.content);
      let result;

      if (media) {
        const mediaParts = await mediaToParts(media);
        result = await model.generateContent([...mediaParts, prompt]);
      } else {
        result = await model.generateContent(prompt);
      }

      const response = await result.response;
      const responseText = response.text();
      try {
        let cleanedText = responseText.replace(/```json|```/g, "").trim();
        const moderationResult = JSON.parse(cleanedText);
        const generatedFilterViolations = moderationResult
          .filter((result) => result.violation_status === true)
          .map((result) => result.rule_title);
        violations = [...generatedFilterViolations, ...violations];
        if (generatedFilterViolations.length) {
          post.status = "rejected";
          member.restriction.violations = violations;
        }
      } catch (err) {
        if (media) {
          deleteFile(media.path);
        }
        throw new Error("Error occured while processing post");
      }
    }
  }

  return { post, member, violations };
};

The checkPostForViolations function checks whether a post violates the spam filter or the generated moderation filters. If there is a violation, the post’s status is updated to rejected, the community member’s violation count is increased, and their violations are recorded. The function returns the updated post and member objects along with the list of violations. Later in this tutorial, we will use the violation count set here to dynamically update a user’s permissions.

Now, to test both the spam filter and the generated moderation filters, navigate to the app in the browser and click on the Auto Moderation button. Then turn on the Presets, Block Spam Post, and Generated Filter switches. Also, check the Time out member checkbox and, in the modal that appears for setting how long a user will be timed out, click Done.

image.png

After that, sign in with another username by editing the username field at the top-left of the page. This is necessary because the admin is exempted from the rules by the checkPostForViolations function above. Then try posting something that violates the moderation rules.

image (2).gif

Enabling Community Admin to Define Custom Moderation Filters

Custom moderation rules enable admins to specify rules beyond what is included in the community guidelines. Doing this involves providing Gemini with the custom rule to be added, asking it to evaluate its feasibility, and then returning the evaluation results.

Navigate to pages/api/communities/generated-filters.js and add the following import and prompt at the top of the file.

import genAI from "@/server/utils/gemini";

const prompt = `
You are an AI assistant tasked with evaluating user-specified custom moderation filters. These filters describe criteria for content violations that users want to enforce in the platform.

Your goal is to determine whether a violation of the custom filter can be identified by the Gemini 1.5 Flash language model (LLM), without requiring fine-tuning, by analyzing only the body and media of a submitted post.

A post consists only of:
1. **Body**: The text content of the post.
2. **Media**: Descriptive features derived from the post's images.

### Instructions:
For each custom moderation filter provided:
1. **Evaluate Feasibility**: Determine if a violation of the filter can be identified by analyzing **only** the post's body or media.
2. **Return the Result in JSON format**:
- **If Fully Feasible**:
  - Include the title and description in the JSON output. The "partial_match" field should be an empty array ("[]").
  - **Descriptions MUST begin with the word "Block".**
- **If Partially Feasible (Too Broad but Partially Detectable)**:
  - Include the title, description, and a "partial_match" array. The "partial_match" array should list the specific parts of the filter that *can* be detected. The description should then explain how those parts are detected.
  - **Descriptions MUST begin with the word "Block".**
- **If Not Feasible**:
  - Provide a JSON object with:
    - An error message explaining why the violation cannot be identified based on the post's body or media.
    - A suggestion field containing advice on how to modify or clarify the filter to make it actionable.
3. description

### Output Format:

#### If Fully Feasible:
\`\`\`json
{
  "title": "Filter Title",
  "description": "Provide a clear and concise explanation (maximum 30 words) describing how violations of the filter are identified",
  "partial_match": []
}
\`\`\`

#### If Partially Feasible:
\`\`\`json
{
  "title": "Filter Title",
  "description": "A clear and concise explanation (maximum 30 words) describing how the DETECTABLE parts of the filter are identified",
  "partial_match": [
    "List of specific parts of the filter that CAN be detected"
  ]
}
\`\`\`

#### If Not Feasible:
\`\`\`json
{
  "title": "Filter Title",
  "error": "A concise explanation stating why the filter is not feasible.",
  "suggestion": "Advice on how to modify the filter to make it actionable (maximum 20 words)."
}
\`\`\`

### Example Outputs:
**Fully Feasible Example**:
\`\`\`json
{
  "title": "Posts with Hate Speech",
  "description": "Block posts containing text targeting groups based on race, religion, or gender."
}
\`\`\`

\`\`\`json
{
  "title": "Posts Containing Explicit Content",
  "description": "Block posts with text or images depicting nudity or explicit sexual acts."
}
\`\`\`

**Partially Feasible Example**:
\`\`\`json
{
  "title": "Posts Discussing Illegal Activities",
  "description": "Block posts that explicitly describe performing illegal acts or provide instructions for illegal activities."
}
\`\`\`

\`\`\`json
{
  "title": "Posts Containing Spam ",
  "description": "Block posts identified as unsolicited promotional content or advertising."
}
\`\`\`

**Not Feasible Example**:
\`\`\`json
{
  "title": "Posts from Suspicious Accounts",
  "error": "Filter requires user account history, which is unavailable in the post's body or media.",
  "suggestion": "Define criteria in terms of detectable content, such as language indicating account spam in the post body."
}
\`\`\`

\`\`\`json
{
  "title": "Posts Violating Local Laws",
  "error": "Determining violations of local laws requires knowledge of the user's location, which is not available in post content.",
  "suggestion": "Limit the filter to specific content like explicit references to illegal activities in the post body or images."
}
\`\`\`

### Evaluation Guidelines:
1. **Focus on Content**: Only consider information within the post's body or media for determining feasibility.
2. **Avoid Vague Terms**: Reject filters that are overly generic (e.g., "offensive content") or undefined unless specific, detectable aspects can be identified and listed in "partial_match".

Now, evaluate the following custom moderation filter titles provided by users and return the appropriate result.
`;

The above prompt instructs Gemini to evaluate the custom rule provided by an admin and check whether its violation can be detected by analyzing only a submitted post. If the custom rule is fully or partially feasible, it returns a title and description; otherwise, it returns an error. For partially feasible rules, the description will cover only the specific parts of the rule that are feasible.

Next, add the following function in the same file; this is the backend handler that sends the evaluation request to Gemini.

const evaluateCustomFilter = async (req, res) => {
  try {
    const { filterTitle } = req.query;
    const model = genAI.getGenerativeModel({
      model: "gemini-1.5-flash",
    });
    const result = await model.generateContent(
      `${prompt}\n\n
      **Title**:${filterTitle}`
    );
    const responseText = result.response.text();
    let cleanedText = responseText.replace(/```json|```/g, "").trim();
    const jsonResponse = JSON.parse(cleanedText);
    return res.status(200).json(jsonResponse);
  } catch (err) {
    return res.status(500).json({ message: "Internal Server Error" });
  }
};

Next, modify the handler function to the following:

const handler = async (req, res) => {
  if (req.method === "GET") {
    // The AutoMode component (below) calls this endpoint with GET to evaluate a custom filter title
    await evaluateCustomFilter(req, res);
  } else if (req.method === "POST") {
    const result = await verifyToken(req);
    if (result.isError) {
      return res.status(401).json({ message: result.message });
    }
    await saveCustomFilter(req, res);
  } else if (req.method === "DELETE") {
    const result = await verifyToken(req);
    if (result.isError) {
      return res.status(401).json({ message: result.message });
    }
    await deleteFilter(req, res);
  } else {
    res.status(405).json("Method Not Allowed");
  }
};

Then, head over to components/AutoMode.jsx and add the following useEffect.

useEffect(() => {
  if (customFilterTitle.length) {
    const evaluateCustomFilter = async () => {
      setIsEvaluatingFilter(true);
      try {
        const res = await axiosInstance.get(
          `/communities/generated-filters?filterTitle=${customFilterTitle}`
        );
        setFilterElavationResult(res.data);
      } catch {
        // Ignore evaluation errors; the admin can retry by editing the title
      } finally {
        setIsEvaluatingFilter(false);
      }
    };
    const timer = setTimeout(evaluateCustomFilter, 1500);
    return () => {
      clearTimeout(timer);
    };
  } else {
    setFilterElavationResult({});
  }
}, [customFilterTitle]);

The above useEffect uses setTimeout to trigger a delayed API call that evaluates the custom filter whenever the customFilterTitle state changes. The evaluation result is then stored in state for the UI to display.

Now, to try out the custom filter, navigate to the app in the browser, click on the Auto Moderation button, and enter the title of a filter. If the filter is not feasible, an error will be displayed; otherwise, you will see a description of the action that will be taken.

image (3).gif

After saving the filter, sign in with another username and try to submit a post that violates the filter. It will be rejected.

image (4).gif

Dynamically Updating Group Memberships and Privileges Based on User Behavior

When a community member repeatedly violates rules, we want to dynamically update their permissions. What we will implement is this: when a user violates the moderation rules twice, we will limit their ability to react to posts, and when they violate them three times, we will limit their ability to post. This can be done effectively with Permit by making a few modifications.

Remember the violation count we mentioned previously, which we increment by one any time a user’s post violates a rule. To implement the above behavior, we will pass that value to Permit as an attribute, update the existing ABAC User Sets, and create new ones that determine the user’s permissions based on it.

To pass the violation count to Permit, in the createPost function located at pages/api/community-posts/index.js, modify the permitted variable to the following:

const permitted = await permit.check(
  {
    key: userId,
    attributes: {
      role: member.role,
      violations_count: member?.restriction?.violationsCount || 0,
      timed_out: !!member?.restriction?.endTime,
    },
  },
  "create-post",
  "community"
);

Now, navigate to the Permit dashboard, define the following ABAC User Sets, and set their allowed actions.

image.png

image.png

image.png

Finally, update the community member User Set to the following:

image.png

With this, member permissions will be dynamically updated based on their violation count.
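To complete the picture, the react action can be gated with the same pattern. The sketch below is illustrative; adapt it to wherever reactions are handled in your app, since the starter repo’s reaction handler may differ in detail.

// Sketch: gating the "react" action with the same ABAC attributes
const canReact = await permit.check(
  {
    key: userId,
    attributes: {
      role: member.role,
      violations_count: member?.restriction?.violationsCount || 0,
      timed_out: !!member?.restriction?.endTime,
    },
  },
  "react",
  "community"
);

if (!canReact) {
  return res.status(403).json({ message: "Reactions are disabled due to repeated rule violations" });
}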

That’s It!

Traditional methods of content moderation often fall short, mostly because they can’t scale to the sheer volume of content that today’s social communities have to handle. Using AI-based moderation, platforms can enforce moderation rules with greater accuracy, scalability, and adaptability to community-specific values.

In this guide, we’ve explored how to integrate AI moderation into a Next.js application using MongoDB, Permit.io, and Gemini 1.5 Flash, demonstrating how these tools can streamline and improve the moderation process.

Got questions? Join our Slack community, where hundreds of developers are building and implementing authorization into their applications.

Written by

Gabriel L. Manor


Full-Stack Software Technical Leader | Security, JavaScript, DevRel, OPA | Writer and Public Speaker
