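Securely and Compliantly Building AI Large-Model Training Infrastructure and Developing AI Application Services on AWS
Project overview:
Xiao Li Ge (小李哥) continues to introduce, one per day, cutting-edge AI solutions built on the AWS cloud platform, to help readers get familiar with AWS AI best practices and apply them in their day-to-day work.
This article shows how to use AWS Service Catalog to create and manage application products that include large AI models, and how to use permission management to restrict the cloud resources employees can access according to their identity and responsibilities. It also shows how to create an Amazon SageMaker managed machine learning instance on which large models can be trained and deployed, loading model files and model container images privately and securely through VPC endpoints. The design uses a cloud-native, serverless architecture to provide a scalable and secure AI solution. The solution architecture diagram is shown below: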

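Background knowledge for this solution
What is Amazon SageMaker?
Amazon SageMaker is the end-to-end machine learning service from AWS, designed to help developers and data scientists build, train, and deploy machine learning models. SageMaker provides tooling for the whole workflow, from data preparation and model training to model deployment, so that users can run machine learning projects efficiently in the cloud.
What is AWS Service Catalog?
AWS Service Catalog is a service that helps organizations create, manage, and distribute collections of approved cloud services. With Service Catalog, an organization can centrally manage approved resources and configurations, so that development teams follow the organization's best practices and compliance requirements when they use cloud services. Users pick the services they need from a predefined product catalog, which simplifies resource deployment and reduces the risk caused by misconfiguration.
Security and compliance benefits of building AI services with SageMaker
Meeting enterprise compliance requirements:
When building AI services with SageMaker, you can use Service Catalog to define and manage configuration templates that meet your company's compliance standards in advance, so that every AI model and resource deployment follows the organization's security policies and industry regulations such as GDPR or HIPAA.
Data security:
SageMaker provides end-to-end data encryption options, covering data at rest and in transit, to protect sensitive data throughout the AI model lifecycle. You can also use VPC endpoints to privately and securely access data in S3 and to load the AI model container images stored in ECR.
Access control and monitoring:
Through integration with AWS Identity and Access Management (IAM), you can control at a fine-grained level who can access and operate on SageMaker resources. Combined with monitoring tools such as CloudTrail and CloudWatch, an organization can track and audit all operations in real time for transparency and security.
What this solution covers
1. Accessing the model files in S3 privately through a VPC endpoint
2. Creating an AWS Service Catalog portfolio to create and manage users' cloud service products in one place
3. Creating a SageMaker machine learning training instance as a Service Catalog end user
Step-by-step setup: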
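1. Log in to the AWS console, open the serverless compute service Lambda, and create a Lambda function named "SageMakerBuild". Copy in the following code, which creates the SageMaker Jupyter Notebook instance used to train large AI models.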
import json
import time

import boto3
# requests is not bundled with the Lambda Python runtime;
# package it with the function or provide it through a layer.
import requests


def CFTFailedResponse(event, status, message):
    """Report a failure to CloudFormation through the pre-signed ResponseURL."""
    print("Inside CFTFailedResponse")
    responseBody = {
        'Status': status,
        'Reason': message,
        'PhysicalResourceId': event['ServiceToken'],
        'StackId': event['StackId'],
        'RequestId': event['RequestId'],
        'LogicalResourceId': event['LogicalResourceId']
    }
    headers = {'content-type': '', 'content-length': str(len(json.dumps(responseBody)))}
    print('Response = ' + json.dumps(responseBody))
    try:
        req = requests.put(event['ResponseURL'], data=json.dumps(responseBody), headers=headers)
        print("delete_respond_cloudformation res " + str(req))
    except Exception as e:
        print("Failed to send cf response {}".format(e))


def CFTSuccessResponse(event, status, data=None):
    """Report success (and optional output data) to CloudFormation."""
    responseBody = {
        'Status': status,
        'Reason': 'See the details in CloudWatch Log Stream',
        'PhysicalResourceId': event['ServiceToken'],
        'StackId': event['StackId'],
        'RequestId': event['RequestId'],
        'LogicalResourceId': event['LogicalResourceId'],
        'Data': data
    }
    headers = {'content-type': '', 'content-length': str(len(json.dumps(responseBody)))}
    print('Response = ' + json.dumps(responseBody))
    try:
        requests.put(event['ResponseURL'], data=json.dumps(responseBody), headers=headers)
    except Exception as e:
        print("Failed to send cf response {}".format(e))


def lambda_handler(event, context):
    ReqStatus = "SUCCESS"
    print("Event:")
    print(event)
    client = boto3.client('sagemaker')
    ec2client = boto3.client('ec2')
    data = {}

    if event['RequestType'] == 'Create':
        try:
            ## Value initialization from the CloudFormation template ##
            props = event['ResourceProperties']
            project_name = props['ProjectName']
            kmsKeyId = props['KmsKeyId']
            Tags = props['Tags']
            env_name = props['ENVName']
            subnet_name = props['Subnet']
            security_group_name = props['SecurityGroupName']
            input_dict = {}
            input_dict['NotebookInstanceName'] = props['NotebookInstanceName']
            input_dict['InstanceType'] = props['NotebookInstanceType']
            input_dict['Tags'] = props['Tags']
            input_dict['DirectInternetAccess'] = props['DirectInternetAccess']
            input_dict['RootAccess'] = props['RootAccess']
            input_dict['VolumeSizeInGB'] = int(props['VolumeSizeInGB'])
            input_dict['RoleArn'] = props['RoleArn']
            input_dict['LifecycleConfigName'] = props['LifecycleConfigName']
        except Exception as e:
            print(e)
            ReqStatus = "FAILED"
            message = "Parameter Error: " + str(e)
            CFTFailedResponse(event, "FAILED", message)
        if ReqStatus == "FAILED":
            return None

        print("Validating Environment name: " + env_name)
        print("Subnet Id Fetching.....")
        try:
            ## SageMaker subnet: looked up by its Name tag ##
            response = ec2client.describe_subnets(
                Filters=[{'Name': 'tag:Name', 'Values': [subnet_name]}])
            subnetId = response['Subnets'][0]['SubnetId']
            input_dict['SubnetId'] = subnetId
            print("Describe subnet done")
        except Exception as e:
            print(e)
            ReqStatus = "FAILED"
            message = " Project Name is invalid - Subnet Error: " + str(e)
            CFTFailedResponse(event, "FAILED", message)
        if ReqStatus == "FAILED":
            return None

        ## SageMaker security group: looked up by its Name tag ##
        print("Security GroupId Fetching.....")
        try:
            response = ec2client.describe_security_groups(
                Filters=[{'Name': 'tag:Name', 'Values': [security_group_name]}])
            sgId = response['SecurityGroups'][0]['GroupId']
            input_dict['SecurityGroupIds'] = [sgId]
            print("Describe security group done")
        except Exception as e:
            print(e)
            ReqStatus = "FAILED"
            message = "Security Group ID Error: " + str(e)
            CFTFailedResponse(event, "FAILED", message)
        if ReqStatus == "FAILED":
            return None

        try:
            if kmsKeyId:
                input_dict['KmsKeyId'] = kmsKeyId
            else:
                print("No KMS key supplied")
            print(input_dict)
            instance = client.create_notebook_instance(**input_dict)
            print('SageMaker API response')
            print(str(instance))
            NotebookName = input_dict['NotebookInstanceName']
            response = client.describe_notebook_instance(NotebookInstanceName=NotebookName)
            NotebookStatus = response['NotebookInstanceStatus']
            print("NotebookStatus:" + NotebookStatus)
            ## Notebook failure ##
            if NotebookStatus == 'Failed':
                message = NotebookStatus + ": " + response.get('FailureReason', 'unknown') + " :Notebook is not coming InService"
                CFTFailedResponse(event, "FAILED", message)
            else:
                # Poll until the notebook leaves the Pending state
                while NotebookStatus == 'Pending':
                    time.sleep(200)
                    response = client.describe_notebook_instance(NotebookInstanceName=NotebookName)
                    NotebookStatus = response['NotebookInstanceStatus']
                    print("NotebookStatus in loop:" + NotebookStatus)
                ## Notebook success ##
                if NotebookStatus == 'InService':
                    data['Message'] = "SageMaker Notebook name - " + NotebookName + " created successfully"
                    print("message InService :", data['Message'])
                    CFTSuccessResponse(event, "SUCCESS", data)
                else:
                    message = NotebookStatus + ": " + response.get('FailureReason', 'unknown') + " :Notebook is not coming InService"
                    print("message :", message)
                    CFTFailedResponse(event, "FAILED", message)
        except Exception as e:
            print(e)
            ReqStatus = "FAILED"
            CFTFailedResponse(event, "FAILED", str(e))

    if event['RequestType'] == 'Delete':
        NotebookStatus = None
        lifecycle_config = event['ResourceProperties']['LifecycleConfigName']
        NotebookName = event['ResourceProperties']['NotebookInstanceName']
        try:
            response = client.describe_notebook_instance(NotebookInstanceName=NotebookName)
            NotebookStatus = response['NotebookInstanceStatus']
            print("Notebook Status - " + NotebookStatus)
        except Exception as e:
            # The notebook does not exist (or cannot be described)
            print(e)
            NotebookStatus = "Invalid"
        while NotebookStatus == 'Pending':
            time.sleep(30)
            response = client.describe_notebook_instance(NotebookInstanceName=NotebookName)
            NotebookStatus = response['NotebookInstanceStatus']
            print("NotebookStatus:" + NotebookStatus)
        if NotebookStatus != 'Failed' and NotebookStatus != 'Invalid':
            print("Delete request for Notebook name: " + NotebookName)
            print("Stopping the Notebook.....")
            if NotebookStatus != 'Stopped':
                # A running notebook must be stopped before it can be deleted
                try:
                    response = client.stop_notebook_instance(NotebookInstanceName=NotebookName)
                    NotebookStatus = 'Stopping'
                    print("Notebook Status - " + NotebookStatus)
                    while NotebookStatus == 'Stopping':
                        time.sleep(30)
                        response = client.describe_notebook_instance(NotebookInstanceName=NotebookName)
                        NotebookStatus = response['NotebookInstanceStatus']
                        print("NotebookStatus:" + NotebookStatus)
                except Exception as e:
                    print(e)
                    NotebookStatus = "Invalid"
                    CFTFailedResponse(event, "FAILED", str(e))
            else:
                NotebookStatus = 'Stopped'
                print("NotebookStatus:" + NotebookStatus)
        if NotebookStatus != 'Invalid':
            print("Deleting The Notebook......")
            time.sleep(5)
            try:
                response = client.delete_notebook_instance(NotebookInstanceName=NotebookName)
                print("Notebook Deleted")
                data["Message"] = "Notebook Deleted"
                CFTSuccessResponse(event, "SUCCESS", data)
            except Exception as e:
                print(e)
                CFTFailedResponse(event, "FAILED", str(e))
        else:
            print("Notebook Invalid status")
            data["Message"] = "Notebook is not available"
            CFTSuccessResponse(event, "SUCCESS", data)

    if event['RequestType'] == 'Update':
        print("Update operation for Sagemaker Notebook is not recommended")
        data["Message"] = "Update operation for Sagemaker Notebook is not recommended"
        CFTSuccessResponse(event, "SUCCESS", data)
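The handler above reports its result to CloudFormation by sending a JSON document to the pre-signed ResponseURL in the event. The sketch below isolates that payload so its shape can be checked locally without AWS access. `build_cfn_response` and the sample event values are hypothetical and introduced only for illustration; the field names are the ones the Lambda function already uses.

```python
import json


def build_cfn_response(event, status, reason, data=None):
    """Build the JSON body a custom resource sends back to CloudFormation.

    Hypothetical helper: it mirrors the fields used by CFTSuccessResponse
    and CFTFailedResponse in the Lambda function, nothing more.
    """
    if status not in ("SUCCESS", "FAILED"):
        raise ValueError("status must be SUCCESS or FAILED")
    body = {
        "Status": status,
        "Reason": reason,
        "PhysicalResourceId": event["ServiceToken"],
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "Data": data or {},
    }
    return json.dumps(body)


# Made-up values, shaped like a custom resource request event.
sample_event = {
    "ServiceToken": "arn:aws:lambda:us-east-1:111122223333:function:SageMakerBuild",
    "StackId": "example-stack-id",
    "RequestId": "example-request-id",
    "LogicalResourceId": "SageMakerCustomResource",
}

payload = json.loads(
    build_cfn_response(sample_event, "SUCCESS", "ok", {"Message": "created"})
)
print(payload["Status"], payload["Data"]["Message"])
```

Keeping the payload construction separate from the HTTP call makes it easier to test the Create, Delete, and Update branches in isolation.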
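2. Next, create a YAML template, copy in the following code, and upload it to an S3 bucket. CloudFormation uses this template to create the SageMaker Jupyter Notebook as infrastructure as code (IaC).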
AWSTemplateFormatVersion: 2010-09-09
Description: Template to create a SageMaker notebook
Metadata:
  'AWS::CloudFormation::Interface':
    ParameterGroups:
      - Label:
          default: Environment detail
        Parameters:
          - ENVName
      - Label:
          default: SageMaker Notebook configuration
        Parameters:
          - NotebookInstanceName
          - NotebookInstanceType
          - DirectInternetAccess
          - RootAccess
          - VolumeSizeInGB
      - Label:
          default: Load S3 Bucket to SageMaker
        Parameters:
          - S3CodePusher
          - CodeBucketName
      - Label:
          default: Project detail
        Parameters:
          - ProjectName
          - ProjectID
    ParameterLabels:
      DirectInternetAccess:
        default: Default Internet Access
      NotebookInstanceName:
        default: Notebook Instance Name
      NotebookInstanceType:
        default: Notebook Instance Type
      ENVName:
        default: Environment Name
      ProjectName:
        default: Project Suffix
      RootAccess:
        default: Root access
      VolumeSizeInGB:
        default: Volume size for the SageMaker Notebook
      ProjectID:
        default: SageMaker ProjectID
      CodeBucketName:
        default: Code Bucket Name
      S3CodePusher:
        default: Copy code from S3 to SageMaker
Parameters:
  SubnetName:
    Default: ProSM-ResourceSubnet
    Description: Subnet Random String
    Type: String
  SecurityGroupName:
    Default: ProSM-ResourceSG
    Description: Security Group Name
    Type: String
  SageMakerBuildFunctionARN:
    Description: Service Token Value passed from Lambda Stack
    Type: String
  NotebookInstanceName:
    AllowedPattern: '[A-Za-z0-9-]{1,63}'
    ConstraintDescription: >-
      Maximum of 63 alphanumeric characters. Can include hyphens (-), but not
      spaces. Must be unique within your account in an AWS Region.
    Description: SageMaker Notebook instance name
    MaxLength: '63'
    MinLength: '1'
    Type: String
  NotebookInstanceType:
    ConstraintDescription: Must select a valid notebook instance type.
    Default: ml.t3.medium
    Description: Select Instance type for the SageMaker Notebook
    Type: String
  ENVName:
    Description: SageMaker infrastructure naming convention
    Type: String
  ProjectName:
    Description: >-
      The suffix appended to all resources in the stack. This will allow
      multiple copies of the same stack to be created in the same account.
    Type: String
  RootAccess:
    Description: Root access for the SageMaker Notebook user
    AllowedValues:
      - Enabled
      - Disabled
    Default: Enabled
    Type: String
  VolumeSizeInGB:
    Description: >-
      The size, in GB, of the ML storage volume to attach to the notebook
      instance. The default value is 5 GB.
    Type: Number
    Default: '20'
  DirectInternetAccess:
    Description: >-
      If you set this to Disabled this notebook instance will be able to access
      resources only in your VPC. As per the Project requirement, we have
      Disabled it.
    Type: String
    Default: Disabled
    AllowedValues:
      - Disabled
    ConstraintDescription: Must select a valid notebook instance type.
  ProjectID:
    Type: String
    Description: Enter a valid ProjectID.
    Default: QuickStart007
  S3CodePusher:
    Description: Do you want to load the code from S3 to SageMaker Notebook
    Default: 'NO'
    AllowedValues:
      - 'YES'
      - 'NO'
    Type: String
  CodeBucketName:
    Description: S3 Bucket name from which you want to copy the code to SageMaker.
    Default: lab-materials-bucket-1234
    Type: String
Conditions:
  BucketCondition: !Equals
    - 'YES'
    - !Ref S3CodePusher
Resources:
  SagemakerKMSKey:
    Type: 'AWS::KMS::Key'
    Properties:
      EnableKeyRotation: true
      Tags:
        - Key: ProjectID
          Value: !Ref ProjectID
        - Key: ProjectName
          Value: !Ref ProjectName
      KeyPolicy:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              AWS: !Sub 'arn:aws:iam::${AWS::AccountId}:root'
            Action:
              - 'kms:Encrypt'
              - 'kms:PutKeyPolicy'
              - 'kms:CreateKey'
              - 'kms:GetKeyRotationStatus'
              - 'kms:DeleteImportedKeyMaterial'
              - 'kms:GetKeyPolicy'
              - 'kms:UpdateCustomKeyStore'
              - 'kms:GenerateRandom'
              - 'kms:UpdateAlias'
              - 'kms:ImportKeyMaterial'
              - 'kms:ListRetirableGrants'
              - 'kms:CreateGrant'
              - 'kms:DeleteAlias'
              - 'kms:RetireGrant'
              - 'kms:ScheduleKeyDeletion'
              - 'kms:DisableKeyRotation'
              - 'kms:TagResource'
              - 'kms:CreateAlias'
              - 'kms:EnableKeyRotation'
              - 'kms:DisableKey'
              - 'kms:ListResourceTags'
              - 'kms:Verify'
              - 'kms:DeleteCustomKeyStore'
              - 'kms:Sign'
              - 'kms:ListKeys'
              - 'kms:ListGrants'
              - 'kms:ListAliases'
              - 'kms:ReEncryptTo'
              - 'kms:UntagResource'
              - 'kms:GetParametersForImport'
              - 'kms:ListKeyPolicies'
              - 'kms:GenerateDataKeyPair'
              - 'kms:GenerateDataKeyPairWithoutPlaintext'
              - 'kms:GetPublicKey'
              - 'kms:Decrypt'
              - 'kms:ReEncryptFrom'
              - 'kms:DisconnectCustomKeyStore'
              - 'kms:DescribeKey'
              - 'kms:GenerateDataKeyWithoutPlaintext'
              - 'kms:DescribeCustomKeyStores'
              - 'kms:CreateCustomKeyStore'
              - 'kms:EnableKey'
              - 'kms:RevokeGrant'
              - 'kms:UpdateKeyDescription'
              - 'kms:ConnectCustomKeyStore'
              - 'kms:CancelKeyDeletion'
              - 'kms:GenerateDataKey'
            Resource:
              - !Join
                - ''
                - - 'arn:aws:kms:'
                  - !Ref 'AWS::Region'
                  - ':'
                  - !Ref 'AWS::AccountId'
                  - ':key/*'
          - Sid: Allow access for Key Administrators
            Effect: Allow
            Principal:
              AWS:
                - !GetAtt SageMakerExecutionRole.Arn
            Action:
              - 'kms:CreateAlias'
              - 'kms:CreateKey'
              - 'kms:CreateGrant'
              - 'kms:CreateCustomKeyStore'
              - 'kms:DescribeKey'
              - 'kms:DescribeCustomKeyStores'
              - 'kms:EnableKey'
              - 'kms:EnableKeyRotation'
              - 'kms:ListKeys'
              - 'kms:ListAliases'
              - 'kms:ListKeyPolicies'
              - 'kms:ListGrants'
              - 'kms:ListRetirableGrants'
              - 'kms:ListResourceTags'
              - 'kms:PutKeyPolicy'
              - 'kms:UpdateAlias'
              - 'kms:UpdateKeyDescription'
              - 'kms:UpdateCustomKeyStore'
              - 'kms:RevokeGrant'
              - 'kms:DisableKey'
              - 'kms:DisableKeyRotation'
              - 'kms:GetPublicKey'
              - 'kms:GetKeyRotationStatus'
              - 'kms:GetKeyPolicy'
              - 'kms:GetParametersForImport'
              - 'kms:DeleteCustomKeyStore'
              - 'kms:DeleteImportedKeyMaterial'
              - 'kms:DeleteAlias'
              - 'kms:TagResource'
              - 'kms:UntagResource'
              - 'kms:ScheduleKeyDeletion'
              - 'kms:CancelKeyDeletion'
            Resource:
              - !Join
                - ''
                - - 'arn:aws:kms:'
                  - !Ref 'AWS::Region'
                  - ':'
                  - !Ref 'AWS::AccountId'
                  - ':key/*'
          - Sid: Allow use of the key
            Effect: Allow
            Principal:
              AWS:
                - !GetAtt SageMakerExecutionRole.Arn
            Action:
              - kms:Encrypt
              - kms:Decrypt
              - kms:ReEncryptTo
              - kms:ReEncryptFrom
              - kms:GenerateDataKeyPair
              - kms:GenerateDataKeyPairWithoutPlaintext
              - kms:GenerateDataKeyWithoutPlaintext
              - kms:GenerateDataKey
              - kms:DescribeKey
            Resource:
              - !Join
                - ''
                - - 'arn:aws:kms:'
                  - !Ref 'AWS::Region'
                  - ':'
                  - !Ref 'AWS::AccountId'
                  - ':key/*'
          - Sid: Allow attachment of persistent resources
            Effect: Allow
            Principal:
              AWS:
                - !GetAtt SageMakerExecutionRole.Arn
            Action:
              - kms:CreateGrant
              - kms:ListGrants
              - kms:RevokeGrant
            Resource:
              - !Join
                - ''
                - - 'arn:aws:kms:'
                  - !Ref 'AWS::Region'
                  - ':'
                  - !Ref 'AWS::AccountId'
                  - ':key/*'
            Condition:
              Bool:
                kms:GrantIsForAWSResource: 'true'
  KeyAlias:
    Type: AWS::KMS::Alias
    Properties:
      AliasName: 'alias/SageMaker-CMK-DS'
      TargetKeyId:
        Ref: SagemakerKMSKey
  SageMakerExecutionRole:
    Type: 'AWS::IAM::Role'
    Properties:
      Tags:
        - Key: ProjectID
          Value: !Ref ProjectID
        - Key: ProjectName
          Value: !Ref ProjectName
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - sagemaker.amazonaws.com
            Action:
              - 'sts:AssumeRole'
      Path: /
      Policies:
        - PolicyName: !Join
            - ''
            - - !Ref ProjectName
              - SageMakerExecutionPolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - 'iam:ListRoles'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:iam::'
                      - !Ref 'AWS::AccountId'
                      - ':role/*'
              - Sid: CloudArnResource
                Effect: Allow
                Action:
                  - 'application-autoscaling:DeleteScalingPolicy'
                  - 'application-autoscaling:DeleteScheduledAction'
                  - 'application-autoscaling:DeregisterScalableTarget'
                  - 'application-autoscaling:DescribeScalableTargets'
                  - 'application-autoscaling:DescribeScalingActivities'
                  - 'application-autoscaling:DescribeScalingPolicies'
                  - 'application-autoscaling:DescribeScheduledActions'
                  - 'application-autoscaling:PutScalingPolicy'
                  - 'application-autoscaling:PutScheduledAction'
                  - 'application-autoscaling:RegisterScalableTarget'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:autoscaling:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':*'
              - Sid: ElasticArnResource
                Effect: Allow
                Action:
                  - 'elastic-inference:Connect'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:elastic-inference:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':elastic-inference-accelerator/*'
              - Sid: SNSArnResource
                Effect: Allow
                Action:
                  - 'sns:ListTopics'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:sns:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':*'
              - Sid: logsArnResource
                Effect: Allow
                Action:
                  - 'cloudwatch:DeleteAlarms'
                  - 'cloudwatch:DescribeAlarms'
                  - 'cloudwatch:GetMetricData'
                  - 'cloudwatch:GetMetricStatistics'
                  - 'cloudwatch:ListMetrics'
                  - 'cloudwatch:PutMetricAlarm'
                  - 'cloudwatch:PutMetricData'
                  - 'logs:CreateLogGroup'
                  - 'logs:CreateLogStream'
                  - 'logs:DescribeLogStreams'
                  - 'logs:GetLogEvents'
                  - 'logs:PutLogEvents'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:logs:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':log-group:/aws/lambda/*'
              - Sid: KmsArnResource
                Effect: Allow
                Action:
                  - 'kms:DescribeKey'
                  - 'kms:ListAliases'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:kms:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':key/*'
              - Sid: ECRArnResource
                Effect: Allow
                Action:
                  - 'ecr:BatchCheckLayerAvailability'
                  - 'ecr:BatchGetImage'
                  - 'ecr:CreateRepository'
                  - 'ecr:GetAuthorizationToken'
                  - 'ecr:GetDownloadUrlForLayer'
                  - 'ecr:DescribeRepositories'
                  - 'ecr:DescribeImageScanFindings'
                  - 'ecr:DescribeRegistry'
                  - 'ecr:DescribeImages'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:ecr:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':repository/*'
              - Sid: EC2ArnResource
                Effect: Allow
                Action:
                  - 'ec2:CreateNetworkInterface'
                  - 'ec2:CreateNetworkInterfacePermission'
                  - 'ec2:DeleteNetworkInterface'
                  - 'ec2:DeleteNetworkInterfacePermission'
                  - 'ec2:DescribeDhcpOptions'
                  - 'ec2:DescribeNetworkInterfaces'
                  - 'ec2:DescribeRouteTables'
                  - 'ec2:DescribeSecurityGroups'
                  - 'ec2:DescribeSubnets'
                  - 'ec2:DescribeVpcEndpoints'
                  - 'ec2:DescribeVpcs'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:ec2:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':instance/*'
              - Sid: S3ArnResource
                Effect: Allow
                Action:
                  - 's3:CreateBucket'
                  - 's3:GetBucketLocation'
                  - 's3:ListBucket'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:s3::'
                      - ':*sagemaker*'
              - Sid: LambdaInvokePermission
                Effect: Allow
                Action:
                  - 'lambda:ListFunctions'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:lambda:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':function'
                      - ':*'
              - Effect: Allow
                Action: 'sagemaker:InvokeEndpoint'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:sagemaker:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':notebook-instance-lifecycle-config/*'
                Condition:
                  StringEquals:
                    'aws:PrincipalTag/ProjectID': !Ref ProjectID
              - Effect: Allow
                Action:
                  - 'sagemaker:CreateTrainingJob'
                  - 'sagemaker:CreateEndpoint'
                  - 'sagemaker:CreateModel'
                  - 'sagemaker:CreateEndpointConfig'
                  - 'sagemaker:CreateHyperParameterTuningJob'
                  - 'sagemaker:CreateTransformJob'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:sagemaker:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':notebook-instance-lifecycle-config/*'
                Condition:
                  StringEquals:
                    'aws:PrincipalTag/ProjectID': !Ref ProjectID
                  'ForAllValues:StringEquals':
                    'aws:TagKeys':
                      - Username
              - Effect: Allow
                Action:
                  - 'sagemaker:DescribeTrainingJob'
                  - 'sagemaker:DescribeEndpoint'
                  - 'sagemaker:DescribeEndpointConfig'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:sagemaker:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':notebook-instance-lifecycle-config/*'
                Condition:
                  StringEquals:
                    'aws:PrincipalTag/ProjectID': !Ref ProjectID
              - Effect: Allow
                Action:
                  - 'sagemaker:DeleteTags'
                  - 'sagemaker:ListTags'
                  - 'sagemaker:DescribeNotebookInstance'
                  - 'sagemaker:ListNotebookInstanceLifecycleConfigs'
                  - 'sagemaker:DescribeModel'
                  - 'sagemaker:ListTrainingJobs'
                  - 'sagemaker:DescribeHyperParameterTuningJob'
                  - 'sagemaker:UpdateEndpointWeightsAndCapacities'
                  - 'sagemaker:ListHyperParameterTuningJobs'
                  - 'sagemaker:ListEndpointConfigs'
                  - 'sagemaker:DescribeNotebookInstanceLifecycleConfig'
                  - 'sagemaker:ListTrainingJobsForHyperParameterTuningJob'
                  - 'sagemaker:StopHyperParameterTuningJob'
                  - 'sagemaker:DescribeEndpointConfig'
                  - 'sagemaker:ListModels'
                  - 'sagemaker:AddTags'
                  - 'sagemaker:ListNotebookInstances'
                  - 'sagemaker:StopTrainingJob'
                  - 'sagemaker:ListEndpoints'
                  - 'sagemaker:DeleteEndpoint'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:sagemaker:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':notebook-instance-lifecycle-config/*'
                Condition:
                  StringEquals:
                    'aws:PrincipalTag/ProjectID': !Ref ProjectID
              - Effect: Allow
                Action:
                  - 'ecr:SetRepositoryPolicy'
                  - 'ecr:CompleteLayerUpload'
                  - 'ecr:BatchDeleteImage'
                  - 'ecr:UploadLayerPart'
                  - 'ecr:DeleteRepositoryPolicy'
                  - 'ecr:InitiateLayerUpload'
                  - 'ecr:DeleteRepository'
                  - 'ecr:PutImage'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:ecr:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':repository/*sagemaker*'
              - Effect: Allow
                Action:
                  - 's3:GetObject'
                  - 's3:ListBucket'
                  - 's3:PutObject'
                  - 's3:DeleteObject'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:s3:::'
                      - !Ref SagemakerS3Bucket
                  - !Join
                    - ''
                    - - 'arn:aws:s3:::'
                      - !Ref SagemakerS3Bucket
                      - /*
                Condition:
                  StringEquals:
                    'aws:PrincipalTag/ProjectID': !Ref ProjectID
              - Effect: Allow
                Action: 'iam:PassRole'
                Resource:
                  - !Join
                    - ''
                    - - 'arn:aws:iam::'
                      - !Ref 'AWS::AccountId'
                      - ':role/*'
                Condition:
                  StringEquals:
                    'iam:PassedToService': sagemaker.amazonaws.com
  CodeBucketPolicy:
    Type: 'AWS::IAM::Policy'
    Condition: BucketCondition
    Properties:
      PolicyName: !Join
        - ''
        - - !Ref ProjectName
          - CodeBucketPolicy
      PolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action:
              - 's3:GetObject'
            Resource:
              - !Join
                - ''
                - - 'arn:aws:s3:::'
                  - !Ref CodeBucketName
              - !Join
                - ''
                - - 'arn:aws:s3:::'
                  - !Ref CodeBucketName
                  - '/*'
      Roles:
        - !Ref SageMakerExecutionRole
  SagemakerS3Bucket:
    Type: 'AWS::S3::Bucket'
    Properties:
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: AES256
      Tags:
        - Key: ProjectID
          Value: !Ref ProjectID
        - Key: ProjectName
          Value: !Ref ProjectName
  S3Policy:
    Type: 'AWS::S3::BucketPolicy'
    Properties:
      Bucket: !Ref SagemakerS3Bucket
      PolicyDocument:
        Version: 2012-10-17
        Statement:
          - Sid: AllowAccessFromVPCEndpoint
            Effect: Allow
            Principal: "*"
            Action:
              - 's3:Get*'
              - 's3:Put*'
              - 's3:List*'
              - 's3:DeleteObject'
            Resource:
              - !Join
                - ''
                - - 'arn:aws:s3:::'
                  - !Ref SagemakerS3Bucket
              - !Join
                - ''
                - - 'arn:aws:s3:::'
                  - !Ref SagemakerS3Bucket
                  - '/*'
            Condition:
              StringEquals:
                "aws:sourceVpce": "<PASTE S3 VPC ENDPOINT ID>"
  EFSLifecycleConfig:
    Type: 'AWS::SageMaker::NotebookInstanceLifecycleConfig'
    Properties:
      NotebookInstanceLifecycleConfigName: 'Provisioned-LC'
      OnCreate:
        - Content: !Base64
            'Fn::Join':
              - ''
              - - |
                  #!/bin/bash
                - |
                  aws configure set sts_regional_endpoints regional
                - yes | cp -rf ~/.aws/config /home/ec2-user/.aws/config
      OnStart:
        - Content: !Base64
            'Fn::Join':
              - ''
              - - |
                  #!/bin/bash
                - |
                  aws configure set sts_regional_endpoints regional
                - yes | cp -rf ~/.aws/config /home/ec2-user/.aws/config
  EFSLifecycleConfigForS3:
    Type: 'AWS::SageMaker::NotebookInstanceLifecycleConfig'
    Properties:
      NotebookInstanceLifecycleConfigName: 'Provisioned-LC-S3'
      OnCreate:
        - Content: !Base64
            'Fn::Join':
              - ''
              - - |
                  #!/bin/bash
                - |
                  # Copy Content
                - !Sub >
                  aws s3 cp s3://${CodeBucketName} /home/ec2-user/SageMaker/ --recursive
                - |
                  # Set sts endpoint
                - >
                  aws configure set sts_regional_endpoints regional
                - yes | cp -rf ~/.aws/config /home/ec2-user/.aws/config
      OnStart:
        - Content: !Base64
            'Fn::Join':
              - ''
              - - |
                  #!/bin/bash
                - |
                  aws configure set sts_regional_endpoints regional
                - yes | cp -rf ~/.aws/config /home/ec2-user/.aws/config
  SageMakerCustomResource:
    Type: 'Custom::SageMakerCustomResource'
    DependsOn: S3Policy
    Properties:
      ServiceToken: !Ref SageMakerBuildFunctionARN
      NotebookInstanceName: !Ref NotebookInstanceName
      NotebookInstanceType: !Ref NotebookInstanceType
      KmsKeyId: !Ref SagemakerKMSKey
      ENVName: !Join
        - ''
        - - !Ref ENVName
          - !Sub Subnet1Id
      Subnet: !Ref SubnetName
      SecurityGroupName: !Ref SecurityGroupName
      ProjectName: !Ref ProjectName
      RootAccess: !Ref RootAccess
      VolumeSizeInGB: !Ref VolumeSizeInGB
      LifecycleConfigName: !If [BucketCondition, !GetAtt EFSLifecycleConfigForS3.NotebookInstanceLifecycleConfigName, !GetAtt EFSLifecycleConfig.NotebookInstanceLifecycleConfigName]
      DirectInternetAccess: !Ref DirectInternetAccess
      RoleArn: !GetAtt
        - SageMakerExecutionRole
        - Arn
      Tags:
        - Key: ProjectID
          Value: !Ref ProjectID
        - Key: ProjectName
          Value: !Ref ProjectName
Outputs:
  Message:
    Description: Execution Status
    Value: !GetAtt
      - SageMakerCustomResource
      - Message
  SagemakerKMSKey:
    Description: KMS Key for encrypting Sagemaker resource
    Value: !Ref KeyAlias
  ExecutionRoleArn:
    Description: ARN of the Sagemaker Execution Role
    Value: !Ref SageMakerExecutionRole
  S3BucketName:
    Description: S3 bucket for SageMaker Notebook operation
    Value: !Ref SagemakerS3Bucket
  NotebookInstanceName:
    Description: Name of the Sagemaker Notebook instance created
    Value: !Ref NotebookInstanceName
  ProjectName:
    Description: Project ID used for SageMaker deployment
    Value: !Ref ProjectName
  ProjectID:
    Description: Project ID used for SageMaker deployment
    Value: !Ref ProjectID
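The `S3Policy` resource in the template only allows access to the notebook bucket when the request arrives through a specific VPC endpoint, using the `aws:sourceVpce` condition key. As a rough sketch, the same policy document can be written as a Python dictionary to see what CloudFormation will render. The function name, bucket name, and endpoint ID below are placeholders introduced for illustration.

```python
import json


def build_vpce_bucket_policy(bucket_name, vpc_endpoint_id):
    """Return a bucket policy equivalent to the template's S3Policy statement."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowAccessFromVPCEndpoint",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:Get*", "s3:Put*", "s3:List*", "s3:DeleteObject"],
                # Both the bucket itself and every object in it
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                # Requests must come through this VPC endpoint
                "Condition": {"StringEquals": {"aws:sourceVpce": vpc_endpoint_id}},
            }
        ],
    }


# Placeholder values, not taken from the article.
policy = build_vpce_bucket_policy("example-sagemaker-bucket", "vpce-0123456789abcdef0")
print(json.dumps(policy, indent=2))
```

The `<PASTE S3 VPC ENDPOINT ID>` placeholder in the template takes the ID of an S3 VPC endpoint, such as the one created in steps 3 to 5.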
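3. Next, open the VPC console, go to Endpoints, and click Create endpoint to create a VPC endpoint that SageMaker will use to access the large-model files in the S3 bucket privately and securely.

4. Name the endpoint "s3-endpoint", choose AWS services as the service category, and select S3 as the service.

5. Select the VPC the endpoint belongs to, configure the route table, and click Create.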
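Steps 3 to 5 can also be scripted. The sketch below assumes a gateway-type S3 endpoint (the step that configures a route table suggests this, but the article does not state the type) and uses the boto3 EC2 call `create_vpc_endpoint`. The helper names, VPC ID, route table ID, and region are made up for illustration.

```python
def build_s3_gateway_endpoint_request(vpc_id, route_table_ids, region, name="s3-endpoint"):
    """Build keyword arguments for ec2.create_vpc_endpoint (gateway endpoint for S3)."""
    return {
        "VpcEndpointType": "Gateway",  # assumption: gateway endpoint
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
        "TagSpecifications": [
            {
                "ResourceType": "vpc-endpoint",
                "Tags": [{"Key": "Name", "Value": name}],
            }
        ],
    }


def create_s3_gateway_endpoint(vpc_id, route_table_ids, region):
    """Create the endpoint and return its ID. Requires AWS credentials."""
    import boto3  # imported lazily so the request builder works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    request = build_s3_gateway_endpoint_request(vpc_id, route_table_ids, region)
    response = ec2.create_vpc_endpoint(**request)
    return response["VpcEndpoint"]["VpcEndpointId"]


# Hypothetical IDs; only the request is built here, nothing is created.
endpoint_request = build_s3_gateway_endpoint_request(
    "vpc-0abc1234", ["rtb-0def5678"], "us-east-1"
)
print(endpoint_request["ServiceName"])
```

The ID returned by `create_s3_gateway_endpoint` is the kind of value the bucket policy's `aws:sourceVpce` condition expects.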

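6. Next, open the AWS Service Catalog console, go to Portfolios, and click Create to create a new portfolio, which manages a complete service made up of different cloud resources in one place.

7. Name the portfolio "SageMakerPortfolio" and set the owner to CQ.

8. Next, add cloud resources to the portfolio by clicking "Create product".

9. Choose to create the product from a CloudFormation IaC template, name the product "SageMakerProduct", and set the owner to CQ.

10. Add the CloudFormation template to the product by URL: enter the URL of the template uploaded to S3 in step 2, set the version to 1, and click Create to create the product.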
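For reference, steps 6 to 10 map onto three boto3 Service Catalog calls: `create_portfolio`, `create_product`, and `associate_product_with_portfolio`. The following is a minimal sketch of that mapping rather than the author's procedure, which uses the console. The helper names and the template URL are hypothetical; the portfolio name, product name, owner, and version come from the steps above.

```python
import uuid


def build_portfolio_request(display_name, provider_name):
    """Keyword arguments for servicecatalog.create_portfolio."""
    return {
        "DisplayName": display_name,
        "ProviderName": provider_name,
        "IdempotencyToken": str(uuid.uuid4()),
    }


def build_product_request(name, owner, template_url, version_name="1"):
    """Keyword arguments for servicecatalog.create_product."""
    return {
        "Name": name,
        "Owner": owner,
        "ProductType": "CLOUD_FORMATION_TEMPLATE",
        "ProvisioningArtifactParameters": {
            "Name": version_name,
            "Info": {"LoadTemplateFromURL": template_url},
            "Type": "CLOUD_FORMATION_TEMPLATE",
        },
        "IdempotencyToken": str(uuid.uuid4()),
    }


def create_portfolio_with_product(template_url):
    """Create the portfolio and product, then link them. Requires AWS credentials."""
    import boto3  # imported lazily so the builders work without boto3

    sc = boto3.client("servicecatalog")
    portfolio = sc.create_portfolio(**build_portfolio_request("SageMakerPortfolio", "CQ"))
    product = sc.create_product(
        **build_product_request("SageMakerProduct", "CQ", template_url)
    )
    portfolio_id = portfolio["PortfolioDetail"]["Id"]
    product_id = product["ProductViewDetail"]["ProductViewSummary"]["ProductId"]
    sc.associate_product_with_portfolio(ProductId=product_id, PortfolioId=portfolio_id)
    return portfolio_id, product_id


# Hypothetical template URL; only the request is built here.
product_request = build_product_request(
    "SageMakerProduct", "CQ",
    "https://example-bucket.s3.amazonaws.com/sagemaker-notebook.yaml",
)
print(product_request["ProvisioningArtifactParameters"]["Name"])
```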

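11. Next, go to the Constraints page and click Create to add a constraint, which uses permission management to restrict what can be done to cloud resources through the Service Catalog product.

12. Select the product we just created, "SageMakerProduct", and choose Launch as the constraint type.

13. Add an IAM role to the constraint; this role defines the permissions that govern the product. Then click Create.

14. Next, click Access and grant access, which limits the users who can use the product.

15. We add the role "SCEndUserRole", which acts on behalf of users when they access the product to create cloud resources.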
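Steps 11 to 15 correspond to the boto3 calls `create_constraint` (with a launch constraint that names an IAM role) and `associate_principal_with_portfolio`. The sketch below shows the request shapes under some assumptions: the launch role name `SCLaunchRole`, the account ID, and the portfolio and product IDs are placeholders, since the article does not give them. Only `SCEndUserRole` comes from the text.

```python
import json
import uuid


def build_launch_constraint_request(portfolio_id, product_id, launch_role_arn):
    """Keyword arguments for servicecatalog.create_constraint (launch constraint)."""
    return {
        "PortfolioId": portfolio_id,
        "ProductId": product_id,
        "Type": "LAUNCH",
        # The constraint parameters are passed as a JSON string
        "Parameters": json.dumps({"RoleArn": launch_role_arn}),
        "IdempotencyToken": str(uuid.uuid4()),
    }


def build_principal_association(portfolio_id, principal_arn):
    """Keyword arguments for servicecatalog.associate_principal_with_portfolio."""
    return {
        "PortfolioId": portfolio_id,
        "PrincipalARN": principal_arn,
        "PrincipalType": "IAM",
    }


def apply_access_controls(portfolio_id, product_id, launch_role_arn, end_user_role_arn):
    """Create the constraint and grant access. Requires AWS credentials."""
    import boto3  # imported lazily so the builders work without boto3

    sc = boto3.client("servicecatalog")
    sc.create_constraint(
        **build_launch_constraint_request(portfolio_id, product_id, launch_role_arn)
    )
    sc.associate_principal_with_portfolio(
        **build_principal_association(portfolio_id, end_user_role_arn)
    )


# Placeholder IDs and account number; nothing is created here.
constraint_request = build_launch_constraint_request(
    "port-example", "prod-example", "arn:aws:iam::111122223333:role/SCLaunchRole"
)
access_request = build_principal_association(
    "port-example", "arn:aws:iam::111122223333:role/SCEndUserRole"
)
print(constraint_request["Type"], access_request["PrincipalType"])
```

With a launch constraint, Service Catalog assumes the launch role when provisioning, so end users do not need direct permissions on the underlying resources.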

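16. Next, we use the Service Catalog product to create a series of cloud resources. Select the product we just created and click Launch.

17. Name the provisioned product "DataScientistProduct" and select version 1, which was created in the earlier step.

18. Configure the parameters of the SageMaker instance that the product will create: the environment name and the instance name.

19. Enter the ARN of the Lambda function created at the beginning, then click Launch to start provisioning.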
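The launch in steps 16 to 19 corresponds to the boto3 call `provision_product`, where the template parameters are passed as a list of key/value pairs. A minimal sketch follows; the parameter names (`ENVName`, `NotebookInstanceName`, `ProjectName`, `SageMakerBuildFunctionARN`) are taken from the template above, while the helper names, IDs, and parameter values are invented for illustration.

```python
import uuid


def build_provision_request(product_id, artifact_id, parameters,
                            provisioned_name="DataScientistProduct"):
    """Keyword arguments for servicecatalog.provision_product."""
    return {
        "ProductId": product_id,
        "ProvisioningArtifactId": artifact_id,
        "ProvisionedProductName": provisioned_name,
        # CloudFormation parameters as a list of {"Key": ..., "Value": ...}
        "ProvisioningParameters": [
            {"Key": key, "Value": str(value)}
            for key, value in sorted(parameters.items())
        ],
        "ProvisionToken": str(uuid.uuid4()),
    }


def launch_product(product_id, artifact_id, parameters):
    """Start provisioning and return the record ID. Requires AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    sc = boto3.client("servicecatalog")
    response = sc.provision_product(
        **build_provision_request(product_id, artifact_id, parameters)
    )
    return response["RecordDetail"]["RecordId"]


# Placeholder IDs and example values; nothing is provisioned here.
provision_request = build_provision_request(
    "prod-example",
    "pa-example",
    {
        "ENVName": "ProSM",
        "NotebookInstanceName": "demo-notebook",
        "ProjectName": "demo",
        "SageMakerBuildFunctionARN":
            "arn:aws:lambda:us-east-1:111122223333:function:SageMakerBuild",
    },
)
print([p["Key"] for p in provision_request["ProvisioningParameters"]])
```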

20. Finally, return to the SageMaker console, where a new Jupyter Notebook instance created through the Service Catalog product is now listed. With this instance, we can develop our AI application services.
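The same check can be done programmatically by polling `describe_notebook_instance` until the notebook leaves the `Pending` state, much as the Lambda function does. The sketch below takes the client as an argument so that it can be exercised with a stand-in; `wait_for_notebook`, `FakeSageMakerClient`, and the notebook name are hypothetical. With real credentials, the client would be `boto3.client("sagemaker")`.

```python
import time


def wait_for_notebook(client, name, poll_seconds=30, max_attempts=40, sleep=time.sleep):
    """Poll until the notebook leaves Pending; return the last status seen."""
    status = "Pending"
    for _ in range(max_attempts):
        response = client.describe_notebook_instance(NotebookInstanceName=name)
        status = response["NotebookInstanceStatus"]
        if status != "Pending":
            break
        sleep(poll_seconds)
    return status


class FakeSageMakerClient:
    """Stand-in for the SageMaker client, used only for this demo."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def describe_notebook_instance(self, NotebookInstanceName):
        return {"NotebookInstanceStatus": next(self._statuses)}


fake = FakeSageMakerClient(["Pending", "Pending", "InService"])
final_status = wait_for_notebook(fake, "demo-notebook", sleep=lambda seconds: None)
print(final_status)  # InService
```

Capping the number of attempts keeps the loop from running indefinitely if the instance never leaves `Pending`.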

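These are all the steps for securely and compliantly training large AI models and developing AI applications on AWS. Join me in future posts for more cutting-edge generative AI development solutions.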
