Certified: AI for Product Management
I’m happy to share that I’ve obtained a new certification: AI for Product Management from Pendo.io!
Verify: https://www.credly.com/badges/10e7acce-1f49-49f4-b348-33e3568f7c29/public_url
Dotnet 9 preview-1 JsonSerializerOptions features
Document summarizer using OpenAI on LangChain
As a use case, this example summarizes a resume. I used Google Colab for this experiment, but you can use your own IDE or environment. Just make sure the necessary prerequisites are in place.
- Since I am using Google Colab, I upload the sample input file to the “Files” store. You can use local disk storage instead if you are on a laptop/PC.
- While you can use any file format, I am using a PDF file as input, so the PDF has to be converted to readable text. I use the pdfx library to read and extract the text.
- Define a meaningful prompt that sets the context.
- Call the OpenAI API.
- Receive the response and show the summarized text.
Below are the instructions and code:
1. Install Prerequisites
I am using the pdfx library to read the PDF document. You can use any PDF text-extraction library here.
pip install pdfx
We call OpenAI through LangChain, so install the required dependencies:
pip install --upgrade langchain langchain-openai tiktoken
2. Load the resume
import pdfx
pdf = pdfx.PDFx('sample_data/Sample Resume.pdf')
resume_content = pdf.get_text()
3. Make the resume content compatible with the LLM chain
from langchain.docstore.document import Document
from langchain.text_splitter import CharacterTextSplitter
model_name = "gpt-3.5-turbo"
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
    model_name=model_name
)
# Caution: this example assumes a small document; tuning chunk sizes for large documents is out of scope
texts = text_splitter.split_text(resume_content)
docs = [Document(page_content=t) for t in texts]
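To give a rough idea of what chunking involves, here is a minimal sketch in plain Python. It is not how CharacterTextSplitter is implemented: the function name split_into_chunks and the max_chars parameter are hypothetical, and it counts characters, whereas the splitter above measures length in tiktoken tokens.

```python
def split_into_chunks(text: str, max_chars: int = 2000, separator: str = "\n\n") -> list[str]:
    """Greedily pack separator-delimited pieces into chunks of at most max_chars.

    Hypothetical helper for illustration only; it counts characters, not tokens.
    """
    chunks: list[str] = []
    current = ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= max_chars or not current:
            # The piece fits, or it is a single oversized piece that is kept whole.
            current = candidate
        else:
            # The piece does not fit: close the current chunk and start a new one.
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "Summary\n\nExperience\n\nEducation"
    for chunk in split_into_chunks(sample, max_chars=20):
        print(repr(chunk))
```

A short resume will normally come back as a single chunk, which is why the rest of this example can ignore the problem.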
4. Initialize OpenAI
from langchain_openai import ChatOpenAI
from google.colab import userdata
# The OpenAI API key is stored in Google Colab's Secrets
OPENAI_API_KEY = userdata.get('openai_api_key')
llm = ChatOpenAI(
    temperature=0,
    openai_api_key=OPENAI_API_KEY,
    model_name=model_name,
)
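If you are running outside Google Colab, userdata is not available. One common alternative is to read the key from an environment variable. The sketch below assumes the variable is called OPENAI_API_KEY; the helper name load_openai_api_key is hypothetical.

```python
import os


def load_openai_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Read the API key from an environment variable (hypothetical helper).

    The variable name is an assumption; use whatever name your environment defines.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return key
```

The returned value can then be passed as openai_api_key in place of the Colab secret.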
5. Define summarization prompt
from langchain.prompts import PromptTemplate
# Use the prompt "List the skills mentioned in below resume:" to list the skills alone
prompt_template = """Summarize below resume:
{text}
"""
prompt = PromptTemplate(template=prompt_template, input_variables=["text"])
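To see what the model will actually receive, you can fill the {text} placeholder by hand. This is a plain-Python sketch that uses str.format rather than PromptTemplate; the names SUMMARY_TEMPLATE, SKILLS_TEMPLATE and fill_prompt are hypothetical, and the sample resume text is made up.

```python
# Same wording as the prompts in this post; the constant names are hypothetical.
SUMMARY_TEMPLATE = """Summarize below resume:
{text}
"""

SKILLS_TEMPLATE = """List the skills mentioned in below resume:
{text}
"""


def fill_prompt(template: str, text: str) -> str:
    """Substitute the resume text into the {text} placeholder."""
    return template.format(text=text)


if __name__ == "__main__":
    # Made-up sample input, not taken from the resume used in this post.
    print(fill_prompt(SKILLS_TEMPLATE, "Jane Doe. Skills: Python, SQL."))
```

Switching between the summary and the skills list is then only a matter of choosing the template.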
6. Summarization
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains import LLMChain
llm_chain = LLMChain(llm=llm, prompt=prompt)
# Chunking is ignored here: all documents are stuffed into a single prompt
chain = StuffDocumentsChain(llm_chain=llm_chain, document_variable_name="text")
summary = chain.run(docs)
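Conceptually, the "stuff" approach puts every chunk into one prompt. The sketch below shows that idea in plain Python. It is not LangChain's implementation: the name stuff_chunks, the blank-line separator and the max_chars guard are assumptions added for illustration.

```python
def stuff_chunks(
    chunks: list[str],
    template: str,
    separator: str = "\n\n",
    max_chars: int = 12000,
) -> str:
    """Join all chunks and place them in a single prompt (hypothetical helper).

    max_chars is an arbitrary character budget, not a real model limit.
    """
    combined = separator.join(chunks)
    if len(combined) > max_chars:
        # Stuffing only works while everything fits into one request.
        raise ValueError("Document too large to stuff into a single prompt.")
    return template.format(text=combined)


if __name__ == "__main__":
    print(stuff_chunks(["Summary", "Experience"], "Summarize below resume:\n{text}\n"))
```

This is why the approach suits a short resume: once the combined text no longer fits in one request, a different strategy is needed.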
7. Print Summary
import textwrap
print(textwrap.fill(summary, width=100))
Example output: <name-removed> is a passionate researcher with a focus on cutting-edge technology such as Machine Learning, Computer Vision, and Deep Learning. He has experience as an Associate Data Scientist-Trainee at Lincode Labs and as an AI/ML Intern. His roles included data collection and cleaning, extending code modules, experimenting and deploying machine/deep learning models, and handling end-to-end processes. He has worked on various projects related to object detection, OCR detection, and classification in the manufacturing domain. <name-removed> has a Bachelor’s degree in Computer Science and skills in Python, machine learning platforms, frameworks, libraries, and tools. He has also worked on academic and personal projects related to border security systems and house price prediction.
Jira Fundamentals Badge earned
Slide deck and video recording for the Cloud Security Session
Below is the slide deck used for the session “Securing the Skies: Navigating Cloud Security Challenges and Beyond” for FDPPI
Webinar: Securing the Skies: Navigating Cloud Security Challenges and Beyond
My upcoming webinar, “Securing the Skies: Navigating Cloud Security Challenges and Beyond”, for FDPPI is on July 26, 2023 at 7 PM IST.
In this talk, I will explore the major topics surrounding cloud security, covering various scenarios, risk challenges, multi-cloud security, and mitigation strategies. By delving into cloud security patterns and best practices, attendees will gain a deep understanding of how to safeguard digital assets in the cloud. The discussion will also extend to API security and the latest developments in cloud security, equipping participants with valuable insights and practical knowledge to protect their data in a connected world. Don’t miss this opportunity to discover effective ways to defend against cloud-related threats and embrace the immense potential of cloud computing securely. Join me, and let us make this session an interactive one.
Miro Academy achievements
Microsoft Learn Learning path completed – Develop Generative AI solutions with Azure OpenAI Service
Vlog: Create an Azure OpenAI instance
This is a screen recording demonstrating how to create a basic Azure OpenAI instance in the Azure portal.