Tuesday, April 9, 2024

ASSOCIATION RULE IN MACHINE LEARNING/PYTHON/ARTIFICIAL INTELLIGENCE

Association rule 

  • Rule Evaluation Metrics
  • Applications of Association Rule Learning
  • Advantages of Association Rule Mining
  • Disadvantages of Association Rule Mining

Association rule learning falls under the realm of unsupervised learning and primarily aims to uncover meaningful connections or associations between items in a dataset. The core objective is to identify interesting relations among various variables within a database. These association rules in machine learning essentially depict the frequency of occurrence of a chosen item or set of items within a transaction.

Market basket analysis is the best-known application of association rule mining, which is why Apriori and related algorithms are often described as market basket analysis algorithms. This technique explores relationships among products typically bought together. Consider a trip to a supermarket where items that tend to be purchased together are strategically placed close to each other. This arrangement is based on observed purchasing patterns and is aimed at encouraging additional purchases: a customer who picks up one item may also be interested in related products.

The significance of association rule mining in data mining (the AIS algorithm was an early approach) extends beyond retail settings. It finds applications in diverse fields such as web usage mining and continuous production. Essentially, it is used to reveal correlations and patterns in various datasets, enabling businesses and analysts to make informed decisions based on observed associations among items or variables.

Real-World Example for Association Rule

Let's look at a real-world example. Emily owns a grocery store and wants to understand her customers' buying behavior better. She has transaction data, so she uses association rule mining to draw insights from it.

By applying the Apriori algorithm, she discovered that customers purchasing organic vegetables often also put organic fruits in their carts, which suggests a preference for healthy, natural foods. Similarly, those buying pasta sauce were likely to purchase pasta, indicating a taste for Italian cuisine.

With these insights, Emily strategically arranged her store layout, placing complementary items like bread and cheese closer together to encourage additional purchases. She also designed targeted promotions, offering discounts on items frequently bought together such as chips and soda or coffee and pastries, to entice customers to buy more.

With the help of association rule mining, Emily was able to transform her grocery store into a thriving hub of community activity.

Types of Association Rule Algorithm

  • Apriori
  • Eclat
  • F-P Growth Algorithm

The Apriori algorithm operates on transactional databases to derive association rules from frequent itemsets. It computes itemsets using breadth-first search and hash trees. Its primary application lies in market basket analysis, although it is also applicable in fields like healthcare, for example for discerning drug reactions among patients.

Eclat, short for Equivalence Class Transformation, uses depth-first search to uncover frequent itemsets in transaction databases. It is often faster than the Apriori algorithm.

The F-P Growth algorithm, where F-P stands for "frequent pattern," is an improvement on the Apriori algorithm. It organizes the database into a tree structure known as a frequent pattern tree and uses it to identify the most common patterns in the data.

Working of Association rule or algorithms for association rule mining

First, we need to know the basic definitions before defining the rule.

Support count (σ) – the number of transactions in which a given itemset appears.

Frequent itemset – an itemset whose support is greater than or equal to a chosen minimum support threshold.

Association rule – an implication expression of the form X => Y, where X and Y are two disjoint itemsets.

Rule Evaluation Metrics

Support (S) – the proportion of all transactions that contain the items from both the {X} and {Y} parts of the rule. This metric gauges how often a set of items occurs together as a fraction of the entire transaction volume.

Supp(X=>Y) = σ(X∪Y)/N

Here σ(X∪Y) is the number of transactions that include both X and Y together, and N is the total number of transactions.

Confidence (C) – confidence quantifies the strength of the rule. It is the ratio of the number of transactions that include all items in both {X} and {Y} to the number of transactions that include all items in {X}.

Conf(X=>Y) = Supp(X∪Y)/Supp(X)

This expression measures how often the items in Y appear in transactions that also include the items in X.

Lift (L) – lift indicates the degree of association between two itemsets, taking into account their frequencies in the dataset. For the rule X => Y, lift is the rule's confidence divided by the expected confidence, that is, the confidence we would see if X and Y were independent. This expected confidence is simply the support (frequency) of {Y}.

Lift(X=>Y) = Conf(X=>Y)/Supp(Y)

When the lift value is around 1, it suggests that X and Y generally appear together as expected. A lift greater than 1 signifies that their co-occurrence is more frequent than anticipated, while a value less than 1 indicates a lower-than-expected co-occurrence. Higher lift values denote a stronger association between the items.

Applications of Association Rule Learning
In the field of machine learning, association rules have several uses. Here are a few of the most common:
Market basket analysis: one of the most common uses of association rule mining, widely used by large retailers throughout the world. Retailers can optimize product placement and marketing tactics by using it to find correlations between different goods.
Medical diagnostics: association rule mining is used for disease prediction and patient risk assessment. By studying trends in patient data, healthcare providers can better understand the likelihood of an illness and adjust treatment strategies accordingly.
Protein sequence analysis: association rule mining methods can reveal correlations between amino acids and help predict protein production patterns. This can make it easier to create synthetic proteins with the properties needed in scientific and commercial contexts.
Other uses: besides healthcare and retail, association rule mining is used for loss-leader analysis, catalog design, and more. Business marketing and decision-making can benefit from identifying correlations between products and customer behavior.

Advantages of Association Rule Mining

Relationship insight: it unearths previously unseen relationships, dependencies, and trends in massive datasets, illuminating consumer habits, market basket analysis, and other areas of study.
Association rules are simple and easy to grasp, allowing non-technical people to gain useful insights even when faced with complex problems.
Retail, e-commerce, and other industries can benefit from its rapid discovery of co-occurring products or events, which in turn helps with suggestions and cross-selling techniques.
Helps in decision-making by giving data on product correlations, which in turn facilitates targeted marketing and better inventory management.
Big data settings can benefit from association rule mining due to the scalability of algorithms such as Apriori and FP-Growth, which can manage enormous datasets.
Application flexibility: it provides insights into many sorts of relationships within data sets and is relevant in diverse industries such as healthcare, finance, telecommunications, and more.
The basis for further analysis: association rule mining serves as a foundation for more complex data mining techniques and can aid in feature selection or dimensionality reduction for machine learning tasks.
Extraction of actionable patterns: enables the extraction of actionable patterns, helps businesses optimize processes, improve sales strategies, or enhance customer experiences.

Disadvantages of Association Rule Mining

Computational complexity: generating rules for massive datasets can become computationally expensive when the number of items or transactions is high, which can limit scalability.

Generation of numerous rules: the algorithm can produce a vast number of rules, including many that might be trivial, redundant, or not actionable. Sorting through this volume to find meaningful associations can be challenging.

Handling of noise and spurious correlations: association rule mining might pick up spurious correlations or associations caused by noise or rare events, leading to unreliable or misleading rules.

Dependency on threshold settings: the quality and relevance of the rules heavily depend on the thresholds set for support and confidence. Determining these thresholds can be subjective and may impact the usefulness of the discovered rules.

Inability to handle continuous variables: association rule mining typically works with categorical or binary data, making it less suitable for continuous variables without preprocessing.

Limited to binary relationships: it primarily discovers binary associations between items or features, potentially missing more complex relationships or interactions between multiple variables.

Assumptions of independence: association rule mining assumes independence between items, which might not hold in all cases, particularly when dealing with sequential or time-related data.

Contextual information exclusion: it might not consider contextual information or temporal relationships between items, limiting its applicability in certain scenarios.

Summary

Association rule mining is a powerful data mining method employed to unveil concealed patterns, connections, and associations within extensive datasets. It delves into relationships and correlations between different items or variables, finding applications in diverse fields such as market basket analysis, recommendation systems, and numerous other domains. The method generates rules (that is, if-then statements) that show the co-occurrence or dependency between items in the data. While it offers insights into item associations and supports decision-making processes, it has limitations such as computational complexity, generation of numerous rules, susceptibility to noise, and assumptions of independence between items. Despite these limitations, association rule mining remains valuable for uncovering associations within datasets, aiding in business strategies, and providing insights into consumer behavior.

Python Code

below is the association/Apriori algorithm code in Python: - 



Tuesday, March 5, 2024

INTRODUCTION TO THE PRE PROCESSOR IN C

Introduction to the pre-processor in C

  • Introduction to pre-processor 
  • Preprocessor directives 
  • Macro substitution 
  • File inclusion directives 
  • Conditional compilation 

The C preprocessor is not part of the compiler itself; it is a separate step that runs before compilation begins. In simple terms, a C preprocessor is a text substitution tool that instructs the compiler to do any required preprocessing before the actual compilation.

The C preprocessor is a macro processor that the C compiler uses automatically to transform your program before compiling it. It is called a macro processor because it lets you define macros, which are short names for longer constructs.

The C preprocessor provides four separate features:

  • Inclusion of header files. These are files of declarations that can be substituted into your program.
  • Macro expansion. You can define macros, which are abbreviations for arbitrary fragments of C code, and the C preprocessor replaces each macro in the program with its definition.
  • Conditional compilation. Using special preprocessing directives, you can include or exclude parts of the program according to various conditions.
  • Line control. If you use a program to combine or rearrange source files into an intermediate file that is then compiled, you can use line control to tell the compiler where each source line originally came from.

 

Preprocessor directives

All preprocessor commands begin with the hash symbol (#). It must be the first nonblank character on the line, and for readability a preprocessor directive should begin in the first column. The most important preprocessor directives, such as #define, #include, and #ifdef, are described in the sections that follow.



Macro substitution

The #define directive specifies the name of a macro and the text that should replace it. The preprocessor simply replaces the macro's name with the replacement text on every line of code where the macro is used.

Now we will try the C program


File inclusion directives
  1. File inclusion directives are used to include header files, including user-defined ones, in C programs.
  2. If no path is given, the preprocessor looks for the header file in the same directory as the source file.
  3. A file inclusion directive begins with #include.
  4. By giving its path, you can include a specific header file in the current file.
  5. User-defined header files are included with "double quotes" instead of angle brackets.
  6. The directive instructs the preprocessor to insert the contents of each named file.

Conditional compilation

Conditional compilation: using conditional compilation directives, we can compile only the code that meets specific requirements, or skip the code that does not.

  1. #ifdef: This is the simplest type of conditional directive. The block it opens is known as a conditional group. The controlled text appears in the preprocessor output only if the macro name is defined. The controlled text inside a conditional may itself contain preprocessing directives; they are executed only if the conditional succeeds. You can nest conditional groups inside one another, but they must be completely nested. Put simply, '#endif' always matches the nearest '#ifdef' (or '#ifndef' or '#if'). Furthermore, you cannot start a conditional group in one file and end it in another.

Syntax:

#ifdef MACRO

    controlled text

#endif /* MACRO */

INTRODUCTION TO POINTERS IN C

Introduction to pointers in C
  • DIFFERENT ASPECTS OF USING POINTERS IN OTHER METHODS IN C
  • C POINTERS OPERATORS 
  • ADVANTAGES 
  • USES
  • NULL POINTER
  • COMPLEX POINTER 

Going back to what we discussed before, a pointer is a variable in C that stores the address of a location in memory. To access and modify data stored in a variable's memory, a pointer is a useful tool.

One of the most interesting and distinctive features of C is its pointers. They make the language more flexible and powerful. Pointers can seem complicated and challenging at first, but once you understand them, C opens up a world of opportunities.

When a variable is declared in a program, the system decides where in memory its value is stored. Using the & operator, we can get the address of that location.

Assume, for the sake of argument, that the system has reserved memory location 80F for the variable n.

 

int n = 10;

int *p = &n;

 

Declaring a pointer

 

In C, a pointer is declared using the * (asterisk symbol). The same symbol, also known as the indirection operator, is used to dereference a pointer.

int *a;   // pointer to int
char *c;  // pointer to char

With the help of the * (indirection operator), we can print the value that the pointer variable p points to.

Let's see the pointer example as explained in the above figure.



 

Now we will understand different aspects of using pointers in other methods using C


 C Pointer Operators

Two distinct kinds of pointer operators exist in C:

  • * operator and
  • & operator

Operators in C have been examined in detail in distinct sections.

The operand's memory address is returned by the & operator. As an illustration,

a = &b;

The memory address of variable b will be saved in variable a.

The counterpart of & is the * operator. The value found at the specified location is returned by this operator.

For instance, if a holds the memory address of a variable, then the code

c = *a;

stores that variable's value in c.

let us understand the pointer in C with a C program


Advantages of using pointer

• Pointers are the natural way to work with arrays and structures in C.
• Pointers allow references to functions, which makes it simpler to pass functions as parameters to other functions.
• A C function can change the arguments passed to it by using pointers.
• They can reduce the size of the program and its running time.
• They make dynamic memory management possible in C.

Usage of pointers

The C language has several uses for pointers.

1) Allocating memory dynamically

In C, we can allocate memory dynamically using the malloc() and calloc() functions, both of which return a pointer.

2) Structures, Functions, and Arrays

In C, pointers are often utilized in arrays, functions, and structures. It increases performance and decreases code.

NULL Pointer

A NULL pointer is a pointer that does not point to any valid address; it holds only the value NULL. If you have no address to assign when you declare a pointer, you can initialize it to NULL, which is safer than leaving it uninitialized.

int *p = NULL;

In most libraries, NULL has the value 0.

Let us try one C program to swap two numbers using 3rd variable


Reading complex pointers

Several things must be taken into consideration while reading the complex pointers in C. Let’s see the precedence and associativity of the operators that are used regarding pointers.

  • In the table above, note the following:

    • (): the bracket operator, used to declare and define a function.
    • []: the array subscript operator.
    • *: the pointer operator.
    • Identifier: the name of the pointer. It always gets the highest priority.
    • Data type: the type of the variable the pointer is intended to point to. It also includes modifiers such as signed int and long.
How to read the pointer: int (*p)[10].
  • To read the pointer, note that () and [] have equal precedence, so their associativity must be considered. The associativity is left to right, so the priority goes to ().

    Inside the () bracket, the pointer operator * and the pointer name (identifier) p have the same precedence. Their associativity is right to left, so the first priority goes to p and the second to *.

    Assign the third priority to [], since the data type has the last precedence. The pointer is therefore read in this order: p first, * second, [10] third, and int last.


The pointer will be read as p is a pointer to an array of integers of size 10.

Example

How to read the following pointer?


It is read as

Here, "p" designates a function that accepts a two-dimensional array of integers as its first parameter and a void pointer as its second, with an integer as the return type of the function.

INTRODUCTION TO FUNCTION IN C LANGUAGE

Introduction to function in C

  • NEED OF FUNCTION
  • ADVANTAGE
  • TYPES OF FUNCTION ASPECTS
  • DIFFERENT ASPECTS OF FUNCTION CALLING
  • C LIBRARY FUNCTION 
  • CALL BY VALUE
  • CALL BY REFERENCE
  • RECURSION
  • RECURSIVE FUNCTION 
  • MEMORY ALLOCATION OF RECURSIVE FUNCTION 

Functions are the basic building blocks of any C program. A function is a group of statements that receives input, processes it, and returns a result. The body of a function is enclosed in curly brackets ({}). Because a function can be called any number of times, functions make C programs more modular and reusable: instead of writing the same code for each input, you put the code inside a function once and call it repeatedly with new arguments.

Why do we need a function in C

Functions offer many benefits to the developer, which is why we need them in C as well as in other programming languages. The main advantages of using functions are:

  • Minimizes redundancy and permits reusability
  • Creates modular code
  • Offers abstraction
  • Makes the program simple to read and understand
  • Divides a complex program into manageable chunks

Advantage of function in C

  • By using functions, we avoid writing the same logic or code again and again, which saves programming time.
  • You can call C functions from anywhere in your program, as often as you like.
  • A large C program is easier to manage when it is split into several functions.
  • Reusability is the main achievement of C functions.
  • However, a function call always carries some overhead in a C program.

A C function has three aspects.

Function declaration: a function must be declared globally in a C program so that the compiler knows its name, parameters, and return type.

Function call: a function can be called from anywhere in the program. The argument list must be the same in the function call and the function declaration; we must pass the same number of arguments as are declared.

Function definition: it contains the actual statements that are to be executed. When the function is called, control passes to the function definition. Note that a function can return only a single value.

Syntax of creating function in C

return_type function_name(data_type parameter)
{
    // code to be executed
}

Types of function

In C programming, functions come in two varieties:

Library functions: functions such as scanf(), printf(), gets(), puts(), ceil(), and floor() are declared in the C header files and are provided by the library.

User-defined functions: functions that the programmer writes for their own use. They reduce the complexity of a large program and help organize the code.


 Return value

A C function may or may not return a value. If the function does not need to return anything, use void as the return type. To demonstrate this idea, let's look at a simple C function that returns nothing.

Example without return value:

void hello()
{
    printf("hello c");
}

Any data type, including char, long, and int, can be used to return any value from the function. The return type of a function is determined by the value that it is expected to return.

This is a simple C function that returns an integer value.

Example with return value

int get()
{
    return 10;  // any int value
}

In the above example, the return type is int because the function returns an integer. If the function has to return a floating-point value instead, we must declare float as the return type.

For example:

float get()
{
    return 10.2;
}

Now we have to call the function and get the value.

 

There are different aspects of function-calling

 

A function may or may not accept arguments, and it may or may not return a value. Based on these facts, there are four types of function calls:

  • function without arguments and without return value
  • function without arguments and with return value
  • function with arguments and without return value
  • function with arguments and with return value

Let us understand this with an example of a function without arguments and without a return value.

 

 

Now let's understand this for function without argument and with return value

 



Let us understand this with a function with arguments and with a return value.

 




C library functions

 

C comes with a set of built-in functions that are grouped together in a library. These functions make common tasks easier. The printf() library function, for instance, prints output to the console.

Library functions are created by the designers of compilers. The functions of the C standard library are declared in header files whose names end in .h. To use a library function in our program, we must include the header file that declares it.

For example, to use library functions such as printf() and scanf(), we need to include stdio.h in our program. This header file contains all the library functions for standard input/output.

 

Let us better understand this by the table given below



 

Call by value in C

 

There are two methods to pass the value into the function

1)      call by value

2)      call by reference

 

 1) call by value 





  • In the call-by-value method, the value of the actual parameters is copied into the formal parameters. In other words, the function call uses the value of the variable.
  • In call by value, we cannot modify the value of an actual parameter through the formal parameter.
  • In call by value, separate memory is allocated for actual and formal parameters, because the value of the actual parameter is copied into the formal parameter.
  • The actual parameter is the argument used in the function call, whereas the formal parameter is the argument used in the function definition.


Let us better understand the concept of call by value in a  C program

 



 Call by reference in C

  • In call by reference, the address of the variable is passed into the function call as the actual parameter.
  • Because the address of the actual parameter is passed, the value of the actual parameter can be modified by changing the formal parameter.
  • In call by reference, formal and actual parameters refer to the same memory location. All operations in the function are performed on the value stored at the address of the actual parameter, and the modified value is stored at the same address.

 

Let us try a C program having call by reference 

 





 Difference between call by value and call by reference

 


 

Recursion in C language

 

Recursion is the process in which a function calls itself to work on a smaller problem. A function that calls itself is called a recursive function, and such function calls are called recursive calls.

Recursion involves a function calling itself repeatedly, so it is important to define a terminating condition for it. Recursive code is shorter than iterative code, but it is harder to understand.

Recursion cannot be applied to every problem, but it is useful for tasks that can be defined in terms of similar subtasks.

For instance, recursion may be applied to sorting, searching, and traversal problems.

Iterative solutions are generally more efficient than recursion, since a function call always carries overhead.

Any problem that can be solved recursively can also be solved iteratively. However, some problems, such as the factorial, the Tower of Hanoi, and the Fibonacci series, are most naturally expressed with recursion.

 

Let us understand recursion by a simple C program





Recursive function

 

A recursive function performs a task by dividing it into subtasks. The function defines a termination condition that is satisfied by some specific subtask. After that, the recursion stops and the final result is returned from the function.

The case in which the function does not recur is called the base case, whereas the case in which the function keeps calling itself to perform a subtask is called the recursive case. All recursive functions can be written in this format.

 

The pseudocode for writing any recursive function is given below.


The pseudocode

 

Let us now understand this by a C program having Fibonacci series

 


 

Memory allocation of recursive function

 

Each recursive call creates a new copy of the function's local data in memory. Once the call returns some data, that copy is removed from memory. Since all the variables and other things declared inside a function are stored on the stack, a separate stack frame is maintained for each recursive call.

Once the value is returned from the corresponding call, its stack frame is destroyed. Recursion involves considerable complexity in resolving and tracking the values at each recursive call, so it is important to keep track of the stack and the values of the variables defined on it.

To have a better understanding of memory allocation in recursive functions, let's examine the following example



