当前位置：首页 > news >正文

Python爬虫使用实例-wallpaper

news 2025/8/26 7:12:00

1_/ 排雷避坑

🥝 中文乱码问题

print(requests.get(url=url,headers=headers).text)

在这里插入图片描述

出现中文乱码

原因分析：

<meta charset="gbk" />

解决方法：
法一：

response = requests.get(url=url,headers=headers)
response.encoding = response.apparent_encoding # 自动转码, 防止中文乱码
print(response.text)

法二：

print(requests.get(url=url,headers=headers).content.decode('gbk'))

2_/ 数据来源

css解析
在这里插入图片描述

for li in lis:href = li.css('a::attr(href)').get()title = li.css('b::text').get()print(href, title)

在这里插入图片描述

删掉标题为空的那一张图

在这里插入图片描述
获取图片url

有的网站，保存的数据是裂开的图片，可能是因为这个参数：
在这里插入图片描述

3_/ 正则处理

处理图片url和标题的时候用了re模块

电脑壁纸
通过匹配非数字字符并在遇到数字时截断字符串

title1 = selector1.css('.photo .photo-pic img::attr(title)').get()
modified_title = re.split(r'\d', title1, 1)[0].strip()

re.split(r'\d', title, 1)将 title 字符串按第一个数字进行分割。返回的列表的第一个元素就是数字前面的部分。strip() 去掉字符串首尾的空白字符。

url图片路径替换，因为从点开图片到达的那个页面无法得到的图片路径还是html页面，不是https://····.jpg，所以替换成另一个可以获取到的页面。

https://sj.zol.com.cn/bizhi/detail_{num1}_{num2}.html

正则替换修改为

https://app.zol.com.cn/bizhi/detail_{num1}.html

例如 https://sj.zol.com.cn/bizhi/detail_12901_139948.html 转换为 https://app.zol.com.cn/bizhi/detail_12901.html .

# https://sj.zol.com.cn/bizhi/detail_12901_139948.html
url = "https://sj.zol.com.cn/bizhi/detail_12901_139948.html"
pattern = r'https://sj\.zol\.com\.cn/bizhi/detail_(\d+)_\d+\.html'
replacement = r'https://app.zol.com.cn/bizhi/detail_\1.html'
new_url = re.sub(pattern, replacement, url)
print(url,new_url)

4_/ 电脑壁纸

🥝 单线程单页

适用于当页面和第一页

# python单线程爬取高清4k壁纸图片
import os
import re
import requests
import parsel
url = 'https://pic.netbian.com/4kmeinv/' # 请求地址
# 模拟伪装
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.139 Safari/537.36'}
# response = requests.get(url=url,headers=headers)
# response.encoding = response.apparent_encoding # 自动转码, 防止中文乱码
# print(response.text)
html_data = requests.get(url=url,headers=headers).content.decode('gbk')
# print(html_data)
selector = parsel.Selector(html_data)
lis = selector.css('.slist li')
for li in lis:#href = li.css('a::attr(href)').get()title = li.css('b::text').get()if title:href = 'https://pic.netbian.com' + li.css('a::attr(href)').get()response = requests.get(url=href, headers=headers)#print(href, title)# 这里只是获取页面# img_content = requests.get(url=href, headers=headers).content# 不可行, 都是同一张图 https://pic.netbian.com/uploads/allimg/230813/221347-16919360273e05.jpghref = 'https://pic.netbian.com' + li.css('a::attr(href)').get()response1 = requests.get(url=href, headers=headers).content.decode('gbk')selector1 = parsel.Selector(response1)# 若要标题乱码，此处可不解码# response1 = requests.get(url=href, headers=headers)# selector1 = parsel.Selector(response1.text)# img_url = selector1.css('.slist li img::attr(src)').get()# 这一步错了, 要去href页面找img_url, 这是在原来的url页面找了img_url = 'https://pic.netbian.com' + selector1.css('.photo .photo-pic img::attr(src)').get()img_content = requests.get(url=img_url,headers=headers).content# 顺便更新一下title, 因为原来的是半截的, 不全title1 = selector1.css('.photo .photo-pic img::attr(title)').get()modified_title = re.split(r'\d', title1, 1)[0].strip()with open('img\\'+modified_title+'.jpg',mode='wb') as f:f.write(img_content)#print(href, title)print('正在保存：', modified_title, img_url)

🥝 单线程多page

适用于从第二页开始的多页

# python单线程爬取高清4k壁纸图片
import os
import re
import time
import requests
import parsel
# url的规律
# https://pic.netbian.com/new/index.html
# https://pic.netbian.com/new/index_1.html
# https://pic.netbian.com/new/index_2.html
# ...
start_time = time.time()
for page in range(2,10):print(f'--------- 正在爬取第{page}的内容 ----------')url = f'https://pic.netbian.com/4kmeinv/index_{page}.html'  # 请求地址# 模拟伪装headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.139 Safari/537.36'}# response = requests.get(url=url,headers=headers)# response.encoding = response.apparent_encoding # 自动转码, 防止中文乱码# print(response.text)html_data = requests.get(url=url, headers=headers).content.decode('gbk')# print(html_data)selector = parsel.Selector(html_data)lis = selector.css('.slist li')for li in lis:# href = li.css('a::attr(href)').get()title = li.css('b::text').get()if title:href = 'https://pic.netbian.com' + li.css('a::attr(href)').get()response = requests.get(url=href, headers=headers)# print(href, title)# 这里只是获取页面# img_content = requests.get(url=href, headers=headers).content# 不可行, 都是同一张图 https://pic.netbian.com/uploads/allimg/230813/221347-16919360273e05.jpghref = 'https://pic.netbian.com' + li.css('a::attr(href)').get()response1 = requests.get(url=href, headers=headers).content.decode('gbk')selector1 = parsel.Selector(response1)# 若要标题乱码，此处可不解码# response1 = requests.get(url=href, headers=headers)# selector1 = parsel.Selector(response1.text)# img_url = selector1.css('.slist li img::attr(src)').get()# 这一步错了, 要去href页面找img_url, 这是在原来的url页面找了img_url = 'https://pic.netbian.com' + selector1.css('.photo .photo-pic img::attr(src)').get()img_content = requests.get(url=img_url, headers=headers).content# 顺便更新一下title, 因为原来的是半截的, 不全title1 = selector1.css('.photo .photo-pic img::attr(title)').get()modified_title = re.split(r'\d', title1, 1)[0].strip()with open('img\\' + modified_title + '.jpg', mode='wb') as f:f.write(img_content)# print(href, title)print('正在保存：', modified_title, img_url)stop_time = time.time()
print(f'耗时：{int(stop_time)-int(start_time)}秒')

运行效果：

在这里插入图片描述

🥝 多线程多页

# python多线程爬取高清4k壁纸图片
import os
import re
import time
import requests
import parsel
import concurrent.futuresdef get_img(url):# 模拟伪装headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.139 Safari/537.36'}# response = requests.get(url=url,headers=headers)# response.encoding = response.apparent_encoding # 自动转码, 防止中文乱码# print(response.text)html_data = requests.get(url=url, headers=headers).content.decode('gbk')# print(html_data)selector = parsel.Selector(html_data)lis = selector.css('.slist li')for li in lis:# href = li.css('a::attr(href)').get()title = li.css('b::text').get()if title:href = 'https://pic.netbian.com' + li.css('a::attr(href)').get()response = requests.get(url=href, headers=headers)# print(href, title)# 这里只是获取页面# img_content = requests.get(url=href, headers=headers).content# 不可行, 都是同一张图 https://pic.netbian.com/uploads/allimg/230813/221347-16919360273e05.jpghref = 'https://pic.netbian.com' + li.css('a::attr(href)').get()response1 = requests.get(url=href, headers=headers).content.decode('gbk')selector1 = parsel.Selector(response1)# 若要标题乱码，此处可不解码# response1 = requests.get(url=href, headers=headers)# selector1 = parsel.Selector(response1.text)# img_url = selector1.css('.slist li img::attr(src)').get()# 这一步错了, 要去href页面找img_url, 这是在原来的url页面找了img_url = 'https://pic.netbian.com' + selector1.css('.photo .photo-pic img::attr(src)').get()img_content = requests.get(url=img_url, headers=headers).content# 顺便更新一下title, 因为原来的是半截的, 不全title1 = selector1.css('.photo .photo-pic img::attr(title)').get()modified_title = re.split(r'\d', title1, 1)[0].strip()img_folder = 'img1\\'if not os.path.exists(img_folder):os.makedirs(img_folder)with open(img_folder + modified_title + '.jpg', mode='wb') as f:f.write(img_content)# print(href, title)print('正在保存：', modified_title, img_url)
def main(url):get_img(url)start_time = time.time()
executor = concurrent.futures.ThreadPoolExecutor(max_workers=5)
for page in range(2, 12):print(f'--------- 正在爬取第{page}的内容 ----------')url = f'https://pic.netbian.com/4kmeinv/index_{page}.html'  # 请求地址executor.submit(main, url)
executor.shutdown()
stop_time = time.time()
print(f'耗时：{int(stop_time) - int(start_time)}秒')

5_/ 手机壁纸

类似地，另一个网站，图片集合多页，点开之后里面有多张图片
先试图获取外部的，再获取里面的，然后2个一起

🥝 单线程单页0

import os
import re
import requests
import parsel
url = 'https://sj.zol.com.cn/bizhi/5/' # 请求地址
# 模拟伪装
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.139 Safari/537.36'}
# response = requests.get(url=url,headers=headers)
# response.encoding = response.apparent_encoding # 自动转码, 防止中文乱码
# print(response.text)
response = requests.get(url=url,headers=headers)
#print(response.text)
selector = parsel.Selector(response.text)
lis = selector.css('.pic-list2 li')
#img_name=1
for li in lis:#href = li.css('a::attr(href)').get()title = li.css('.pic img::attr(title)').get()#href = li.css('.pic img::attr(src)').get()#print(title, href)if title:#href = 'https://sj.zol.com.cn' +li.css('a::attr(href)').get()# https://sj.zol.com.cn/bizhi/detail_12901_139948.html# https://app.zol.com.cn/bizhi/detail_12901_139948.html#p1#href = 'https://app.zol.com.cn' + li.css('a::attr(href)').get() + '#p1'href=li.css('img::attr(src)').get()#print(href, title)#href = 'https://app.zol.com.cn' + li.css('a::attr(href)').get() + '#p1'#response1 = requests.get(url=href, headers=headers).content.decode('utf-8')#selector1 = parsel.Selector(response1)#img_url=selector1.css('.gallery li img::attr(src)').get()#print(img_url)# 这里只是获取页面img_content = requests.get(url=href, headers=headers).content# 不可行, 都是同一张图 https://pic.netbian.com/uploads/allimg/230813/221347-16919360273e05.jpg# https://sj.zol.com.cn/bizhi/detail_12901_139948.html# https://app.zol.com.cn/bizhi/detail_12901_139948.html#p1#href= selector1.css('.photo-list-box li::attr(href)').get()#href = 'https://app.zol.com.cn' +  + '#p1'#response2 = requests.get(url=href, headers=headers)#selector2 = parsel.Selector(response2.text)#print(href)# 若要标题乱码，此处可不解码# response1 = requests.get(url=href, headers=headers)# selector1 = parsel.Selector(response1.text)# img_url = selector1.css('.slist li img::attr(src)').get()# 这一步错了, 要去href页面找img_url, 这是在原来的url页面找了#img_url = selector1.css('.gallery img::attr(src)').get()#img_content = requests.get(url=img_url, headers=headers).content#print(img_url)# 顺便更新一下title, 因为原来的是半截的, 不全# title1 = selector1.css('.photo .photo-pic img::attr(title)').get()img_folder = 'img3\\'if not os.path.exists(img_folder):os.makedirs(img_folder)with open(img_folder + title + '.jpg', mode='wb') as f:f.write(img_content)# print(href, title)print('正在保存：', title, href)#img_name += 1

🥝 单线程单页1

# 下载子页面全部
import os
import requests
import parselurl = 'https://app.zol.com.cn/bizhi/detail_12901.html' # 请求地址
# 模拟伪装
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.139 Safari/537.36'}
response = requests.get(url=url,headers=headers)
selector = parsel.Selector(response.text)
lis = selector.css('.album-list li')
i = 0
for li in lis:# Get all img elements within the current liimg_tags = li.css('img::attr(src)').getall()  # This gets all the img src attributesfor href in img_tags:  # Iterate over all img src attributesimg_content = requests.get(url=href, headers=headers).contentimg_folder = 'img4\\'if not os.path.exists(img_folder):os.makedirs(img_folder)with open(img_folder + str(i) + '.jpg', mode='wb') as f:f.write(img_content)# print(href, i)print('正在保存：', i, href)i += 1  # Increment i for each image saved

🥝 单线程单页

import os
import re
import requests
import parselurl = 'https://sj.zol.com.cn/bizhi/5/' # 请求地址
# 模拟伪装
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.139 Safari/537.36'}
response = requests.get(url=url,headers=headers)
#print(response.text)
selector = parsel.Selector(response.text)
#lis = selector.css('.pic-list2 li')
# 筛除包含的底部 3个 猜你喜欢
lis=selector.css('.pic-list2 .photo-list-padding')
for li in lis:#href = li.css('a::attr(href)').get()title = li.css('.pic img::attr(title)').get()href = li.css('a::attr(href)').get()#print(title, href)# https://sj.zol.com.cn/bizhi/detail_12901_139948.html#url = "https://sj.zol.com.cn/bizhi/detail_12901_139948.html"pattern = r'/bizhi/detail_(\d+)_\d+\.html'replacement = r'https://app.zol.com.cn/bizhi/detail_\1.html'new_url = re.sub(pattern, replacement, href)#print(href, new_url)#url = 'https://app.zol.com.cn/bizhi/detail_12901.html'  # 请求地址# 模拟伪装headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.139 Safari/537.36'}response = requests.get(url=new_url, headers=headers)selector = parsel.Selector(response.text)lis1 = selector.css('.album-list li')i = 0for li1 in lis1:# Get all img elements within the current liimg_tags = li1.css('img::attr(src)').getall()  # This gets all the img src attributesfor href in img_tags:  # Iterate over all img src attributesimg_content = requests.get(url=href, headers=headers).contentimg_folder = 'img5\\'if not os.path.exists(img_folder):os.makedirs(img_folder)with open(img_folder + title+'_'+str(i) + '.jpg', mode='wb') as f:f.write(img_content)# print(href, i)print('正在保存：',title+'_'+str(i), href)i += 1  # Increment i for each image saved

在这里插入图片描述

🥝 单线程多页

import os
import re
import requests
import parselfor page in range(1,3):print(f'--------- 正在爬取第{page}的内容 ----------')if page==1:url = 'https://sj.zol.com.cn/bizhi/5/'  # 请求地址else:url = f'https://sj.zol.com.cn/bizhi/5/{page}.html'  # 请求地址# 模拟伪装headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.139 Safari/537.36'}response = requests.get(url=url, headers=headers)# print(response.text)selector = parsel.Selector(response.text)# lis = selector.css('.pic-list2 li')# 筛除包含的底部 3个 猜你喜欢lis = selector.css('.pic-list2 .photo-list-padding')for li in lis:# href = li.css('a::attr(href)').get()title = li.css('.pic img::attr(title)').get()href = li.css('a::attr(href)').get()# print(title, href)# https://sj.zol.com.cn/bizhi/detail_12901_139948.html# url = "https://sj.zol.com.cn/bizhi/detail_12901_139948.html"pattern = r'/bizhi/detail_(\d+)_\d+\.html'replacement = r'https://app.zol.com.cn/bizhi/detail_\1.html'new_url = re.sub(pattern, replacement, href)# print(href, new_url)# url = 'https://app.zol.com.cn/bizhi/detail_12901.html'  # 请求地址# 模拟伪装headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.139 Safari/537.36'}response = requests.get(url=new_url, headers=headers)selector = parsel.Selector(response.text)lis1 = selector.css('.album-list li')i = 0for li1 in lis1:# Get all img elements within the current liimg_tags = li1.css('img::attr(src)').getall()  # This gets all the img src attributesfor href in img_tags:  # Iterate over all img src attributesimg_content = requests.get(url=href, headers=headers).contentimg_folder = 'img6\\'if not os.path.exists(img_folder):os.makedirs(img_folder)with open(img_folder + title + '_' + str(i) + '.jpg', mode='wb') as f:f.write(img_content)# print(href, i)print('正在保存：', title + '_' + str(i), href)i += 1  # Increment i for each image saved

🥝 多线程多页

import os
import re
import time
import requests
import parsel
import concurrent.futuresdef get_imgs(url):# 模拟伪装headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.139 Safari/537.36'}response = requests.get(url=url, headers=headers)# print(response.text)selector = parsel.Selector(response.text)# lis = selector.css('.pic-list2 li')# 筛除包含的底部 3个 猜你喜欢lis = selector.css('.pic-list2 .photo-list-padding')for li in lis:# href = li.css('a::attr(href)').get()title = li.css('.pic img::attr(title)').get()href = li.css('a::attr(href)').get()# print(title, href)# https://sj.zol.com.cn/bizhi/detail_12901_139948.html# url = "https://sj.zol.com.cn/bizhi/detail_12901_139948.html"pattern = r'/bizhi/detail_(\d+)_\d+\.html'replacement = r'https://app.zol.com.cn/bizhi/detail_\1.html'new_url = re.sub(pattern, replacement, href)# print(href, new_url)# url = 'https://app.zol.com.cn/bizhi/detail_12901.html'  # 请求地址# 模拟伪装headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.139 Safari/537.36'}response = requests.get(url=new_url, headers=headers)selector = parsel.Selector(response.text)lis1 = selector.css('.album-list li')i = 0for li1 in lis1:# Get all img elements within the current liimg_tags = li1.css('img::attr(src)').getall()  # This gets all the img src attributesfor href in img_tags:  # Iterate over all img src attributesimg_content = requests.get(url=href, headers=headers).contentimg_folder = 'img7\\'if not os.path.exists(img_folder):os.makedirs(img_folder)with open(img_folder + title + '_' + str(i) + '.jpg', mode='wb') as f:f.write(img_content)# print(href, i)print('正在保存：', title + '_' + str(i), href)i += 1  # Increment i for each image saveddef main(url):get_imgs(url)start_time = time.time()
executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)
for page in range(1, 9):#print(f'--------- 正在爬取第{page}的内容 ----------')if page == 1:url = 'https://sj.zol.com.cn/bizhi/5/'  # 请求地址else:url = f'https://sj.zol.com.cn/bizhi/5/{page}.html'  # 请求地址executor.submit(main, url)
executor.shutdown()
stop_time = time.time()
print(f'耗时：{int(stop_time) - int(start_time)}秒')

Python爬虫使用实例-wallpaper

1/ 排雷避坑 🥝 中文乱码问题 print(requests.get(urlurl,headersheaders).text)出现中文乱码原因分析： <meta charset"gbk" />解决方法： 法一： response requests.get(urlurl,headersheaders) response.en…...

编程日记 2024/9/14 22:23:00

探索Go语言中的随机数生成、矩阵运算与数独验证

1. Go中的随机数生成在许多编程任务中，随机数的生成是不可或缺的。Go语言通过 math/rand 包提供了伪随机数生成方式。伪随机数由种子(seed)决定，如果种子相同，生成的数列也会相同。为了确保每次程序运行时产生不同的随机数，我们…...

编程日记 2024/9/14 22:19:47

无线安全(WiFi)

免责声明:本文仅做分享!!! 目录 WEP简介 WPA简介安全类型密钥交换 PMK PTK 4次握手 WPA攻击原理网卡选购攻击姿态 1-暴力破解脚本工具字典 2-Airgeddon 破解 3-KRACK漏洞 4-Rough AP 攻击 5-wifi钓鱼 6-wifite 其他 WEP简介 WEP是WiredEquivalentPri…...

编程日记 2024/9/14 22:18:45

牛客练习赛128：Cidoai的平均数对（背包dp）

题目描述给定 nnn 对数 (ai,bi)(a_i,b_i)(ai,bi) 和参数 kkk，你需要选出一些对使得在满足 bib_ibi 的平均值不超过 kkk 的同时，aia_iai 的和最大，求出这个最大值。输入描述: 第一行两个整数分别表示 n,kn,kn,k。接下来 nnn 行&…...

编程日记 2024/9/14 22:17:44

Python世界：简易地址簿增删查改算法实践

Python世界：简易地址簿增删查改算法实践任务背景编码思路代码实现本文小结任务背景该任务来自简明Python教程中迈出下一步一章的问题： 编写一款你自己的命令行地址簿程序， 你可以用它浏览、添加、编辑、删除或搜索你的联系人&#xff…...

编程日记 2024/9/14 22:16:42

网络安全-intigriti-0422-XSS-Challenge Write-up

目录一、环境二、解题 2.1看源码一、环境 Intigriti April Challenge 二、解题要求：弹出域名就算成功 2.1看源码我们看到marge方法，肯定是原型链污染题目接的是传参，我们可控的点在于qs.config和qs.settings，这两个可…...

编程日记 2024/9/14 22:14:37

Debian Linux 11 使用crash

文章目录前言一、环境安装1.1 安装debug package1.2 安装crash 二、使用crash 前言 # cat /etc/os-release PRETTY_NAME"Debian GNU/Linux 11 (bullseye)" NAME"Debian GNU/Linux" VERSION_ID"11" VERSION"11 (bullseye)" VERSION_C…...

编程日记 2024/9/14 22:10:33

python列表 — 按顺序找出b表中比a表多出的元素

目录一、功能描述二、适用场景三、代码实现一、功能描述有a、b两个列表，a列表有3个元素；b列表有7个元素。b列表多出的一个元素可能在随机的位置，在不影响其他元素的情况下，找到b列表多出的那四个元素，并按照在…...

编程日记 2024/9/14 22:07:21

如何使用Python创建目录或文件路径列表

在 Python 中，创建目录或生成文件路径列表通常涉及使用 os、os.path 或 pathlib 模块。下面是一些常见的任务和方法，用于在 Python 中创建目录或获取文件路径列表。问题背景在初始阶段的 Python 学习过程中，可能遇到这样的问题&#xff1a…...

编程日记 2024/9/14 22:06:21

领夹麦克风哪个品牌好，哪种领夹麦性价比高，无线麦克风推荐

在音频录制需求日益多样化的今天，无线领夹麦克风作为提升音质的关键设备，其重要性不言而喻。市场上鱼龙混杂，假冒伪劣、以次充好的现象屡见不鲜。这些产品往往以低价吸引消费者，却在音质、稳定性、耐用性等方面大打折扣&#xff0…...

编程日记 2024/9/14 22:05:19

文章目录二.新增菜品1.图片上传2.具体新增菜品二.新增菜品 1.图片上传这里采用了阿里云oss对象存储服务 application.yml alioss:endpoint: ${sky.alioss.endpoint}access-key-id: ${sky.alioss.access-key-id}access-key-secret: ${sky.alioss.access-key-secret}bucket…...

编程日记 2024/9/14 22:04:19

什么是卷积层、池化层、BN层，有什么作用？

什么是卷积层、池化层、BN层，有什么作用？ 卷积层池化层BN层卷积层定义： 卷积层是CNN中的核心组件，它通过卷积运算对输入数据进行特征提取。卷积层由多个卷积单元组成，每个卷积单元的参数通过反向传播算法优化得到。…...

编程日记 2024/9/14 22:02:16

[学习笔记]《CSAPP》深入理解计算机系统 - Chapter 4 处理器体系结构Chapter 5 优化程序性能

总结一些第四章和第五章的一些关键信息 Chapter 4 处理器体系结构将处理组织成阶段 Chapter 5 优化程序性能 Chapter 4 处理器体系结构在硬件中，寄存器直接将它的输入和输出线连接到电路的其他盆。在机器级变成中，寄存器代表的是 CPU 中为数不多的可寻…...

编程日记 2024/9/14 21:59:13

案例分享|我是这样转型做数据产品经理的？

本文为才聚学员投稿的原创作品，现在才聚正面向专业项目管理者征集“项目管理实战案例”原创文章，被采纳即可获得丰厚稿酬，欢迎大家关注公众号踊跃投稿。如您有意向投稿，可将稿件投递给我们。故事介绍三段故事，讲…...

编程日记 2024/9/14 21:57:11

ffmpeg面向对象-rtsp拉流相关对象

目录 1.AVFormatContext和FFFormatContext类。1.1 概述1.2 构造函数1.3 oopc的继承实现 2. AVInputFormat 类。2.1 多态的实现 3.所用设计模式3.1模板模式3.2 工厂模式？ 3.3 rtsp拉流建链 4.this指针5.小结6.rtsp拉流流程 1.AVFormatContext和FFFormatContext类。 …...

编程日记 2024/9/14 21:55:09

feign client发送Post请求，发送对象参数，服务端接收不到正确参数报错排查

记一次feignclient发送请求服务端接收不到正确参数排查服务端代码： Operation(summary "Create team")PostMapping("post")RequiresPermissions("team:add")public RestResponse addTeam(Valid Team team) {this.teamService.crea…...

编程日记 2024/9/14 21:54:08

Hadoop林子雨安装

文章目录 hadoop安装教程注意事项： hadoop安装教程链接: 安装教程注意事项： 可以先安装ububtu增强功能，完成共享粘贴板和共享文件夹 ubuntu增强功能 2.这里就可以使用共享文件夹或者在虚拟机浏览器，用微信文件传输助手传文…...

编程日记 2024/9/14 21:52:05

Springboot项目总结

1.为了调用写在其他包里面的类的方法但是不使用new来实现调用这个类里面的方法，这个时候我们就需要将这个类注入到ioc容器里面，通过ioc容器来实现自动生成一个对象。对ioc容器的理解：自动将一个对象实现new. 考察了and 和 or组合使用&…...

编程日记 2024/9/14 21:49:03

目标检测从入门到精通——数据增强方法总结

以下是YOLO系列算法（从YOLOv1到YOLOv7）中使用的数据增强方法的总结，包括每种方法的数学原理、相关论文以及对应的YOLO版本。 YOLO系列数据增强方法总结数据增强方法数学原理相关论文图像缩放将输入图像缩放到固定大小（如448x44…...

编程日记 2024/9/14 21:47:00

SQL server 的异常处理一个SQL异常如何不影响其他SQL执行

在 SQL Server 中，存储过程中的 SQL 语句是顺序执行的。如果其中任何一个 SQL 语句遇到了错误或异常，那么默认情况下，这个错误会导致整个事务（如果有的话）回滚，并且存储过程会立即停止执行，不会…...

编程日记 2024/9/14 21:44:56

Go 语言接口详解

Go 语言接口详解核心概念接口定义在 Go 语言中，接口是一种抽象类型，它定义了一组方法的集合： // 定义接口 type Shape interface {Area() float64Perimeter() float64 } 接口实现 Go 接口的实现是隐式的： // 矩形结构体…...

编程新知 2025/8/13 18:32:28

条件运算符

C中的三目运算符（也称条件运算符，英文：ternary operator）是一种简洁的条件选择语句，语法如下： 条件表达式 ? 表达式1 : 表达式2• 如果“条件表达式”为true，则整个表达式的结果为“表达式1”…...

编程新知 2025/7/23 11:19:45

江苏艾立泰跨国资源接力：废料变黄金的绿色供应链革命

在华东塑料包装行业面临限塑令深度调整的背景下，江苏艾立泰以一场跨国资源接力的创新实践，重新定义了绿色供应链的边界。跨国回收网络：废料变黄金的全球棋局艾立泰在欧洲、东南亚建立再生塑料回收点，将海外废弃包装箱通过标准…...

编程新知 2025/8/3 22:02:59

selenium学习实战【Python爬虫】

selenium学习实战【Python爬虫】文章目录 selenium学习实战【Python爬虫】一、声明二、学习目标三、安装依赖3.1 安装selenium库3.2 安装浏览器驱动3.2.1 查看Edge版本3.2.2 驱动安装四、代码讲解4.1 配置浏览器4.2 加载更多4.3 寻找内容4.4 完整代码五、报告文件爬取5.1 提…...

编程新知 2025/8/5 6:59:06

蓝桥杯3498 01串的熵

问题描述对于一个长度为 23333333的 01 串, 如果其信息熵为 11625907.5798， 且 0 出现次数比 1 少, 那么这个 01 串中 0 出现了多少次? #include<iostream> #include<cmath> using namespace std;int n 23333333;int main() {//枚举 0 出现的次数//因…...

编程新知 2025/8/22 3:24:23

Scrapy-Redis分布式爬虫架构的可扩展性与容错性增强：基于微服务与容器化的解决方案

在大数据时代，海量数据的采集与处理成为企业和研究机构获取信息的关键环节。Scrapy-Redis作为一种经典的分布式爬虫架构，在处理大规模数据抓取任务时展现出强大的能力。然而，随着业务规模的不断扩大和数据抓取需求的日益复杂，传统…...

编程新知 2025/8/22 5:19:23

土建施工员考试：建筑施工技术重点知识有哪些？

《管理实务》是土建施工员考试中侧重实操应用与管理能力的科目，核心考查施工组织、质量安全、进度成本等现场管理要点。以下是结合考试大纲与高频考点整理的重点内容，附学习方向和应试技巧： 一、施工组织与进度管理核心目标： 规…...

编程新知 2025/7/9 1:00:40

FOPLP vs CoWoS

以下是 FOPLP（Fan-out panel-level packaging 扇出型面板级封装）与 CoWoS（Chip on Wafer on Substrate）两种先进封装技术的详细对比分析，涵盖技术原理、性能、成本、应用场景及市场趋势等维度： 一、技术原…...

编程新知 2025/7/6 23:27:14

LUA+Reids实现库存秒杀预扣减记录流水以及自己的思考

目录 lua脚本记录流水记录流水的作用流水什么时候删除我们在做库存扣减的时候，显示基于Lua脚本和Redis实现的预扣减这样可以在秒杀扣减的时候保证操作的原子性和高效性 lua脚本 // ... 已有代码 ...Overridepublic InventoryResponse decrease(Inventor…...

编程新知 2025/7/7 12:27:56

华为云Flexus+DeepSeek征文 | 基于Dify构建具备联网搜索能力的知识库问答助手

华为云FlexusDeepSeek征文 | 基于Dify构建具备联网搜索能力的知识库问答助手一、构建知识库问答助手引言二、构建知识库问答助手环境2.1 基于FlexusX实例的Dify平台2.2 基于MaaS的模型API商用服务三、构建知识库问答助手实战3.1 配置Dify环境3.2 创建知识库问答助手3.3 使用知…...

编程新知 2025/8/22 5:03:57

1/ 排雷避坑

🥝 中文乱码问题

2/ 数据来源

3/ 正则处理

4/ 电脑壁纸

🥝 单线程单页

🥝 单线程多page

🥝 多线程多页

5/ 手机壁纸

🥝 单线程单页0

🥝 单线程单页1

🥝 单线程单页

🥝 单线程多页

🥝 多线程多页

相关文章：

1_/ 排雷避坑

2_/ 数据来源

3_/ 正则处理

4_/ 电脑壁纸

5_/ 手机壁纸